RECOGNITION OF TEXTUAL AND VISUAL INFORMATION USING A HYBRID SVD AND CNN MODEL

Authors

Пелещак, І.

DOI:

https://doi.org/10.30890/2567-5273.2025-39-02-049

Keywords:

information recognition, convolutional neural network, singular value decomposition, hybrid model, multi-format data.

Abstract

The article explores the recognition of textual and visual information using a hybrid model that combines Singular Value Decomposition (SVD) with a Convolutional Neural Network (CNN). The first experiment was conducted on the MNIST dataset, where the base
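
As a rough illustration of the SVD half of such a hybrid model, the sketch below compresses a single 28×28 array (the size of an MNIST image) with a truncated SVD before it would be handed to a CNN. It is not the article's implementation: the function name `low_rank_approx`, the ranks tried, and the synthetic random image standing in for real MNIST data are all assumptions made for illustration, and the CNN stage is omitted.

```python
import numpy as np


def low_rank_approx(image, k):
    """Return the rank-k approximation of a 2-D array via truncated SVD."""
    # Thin SVD: image = U @ diag(s) @ Vt, singular values sorted descending.
    u, s, vt = np.linalg.svd(image, full_matrices=False)
    # Keep only the k largest singular values and their vectors.
    return (u[:, :k] * s[:k]) @ vt[:k, :]


def relative_error(image, k):
    """Frobenius-norm error of the rank-k approximation, relative to the image."""
    approx = low_rank_approx(image, k)
    return np.linalg.norm(image - approx) / np.linalg.norm(image)


# Synthetic stand-in for one MNIST-sized grayscale image (assumption).
rng = np.random.default_rng(0)
image = rng.random((28, 28))

# Hypothetical ranks; the article's actual choice is not stated here.
errors = {k: relative_error(image, k) for k in (5, 10, 28)}
compressed = low_rank_approx(image, 5)  # same shape, lower rank
```

The approximation keeps the original shape, so in a pipeline of this kind it could be passed to a convolutional network unchanged; a larger `k` trades less compression for a smaller reconstruction error.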

References

Tan, M., & Le, Q. V. (2021). EfficientNetV2: Smaller models and faster training. arXiv. https://doi.org/10.48550/arxiv.2104.00298

Thi, T. C. (2023). Singular value decomposition and applications in data processing and artificial intelligence. HPU2 Journal of Science Natural Sciences and Technology, 2(3), 34–41. https://doi.org/10.56764/hpu2.jos.2023.2.3.34-41

Chen, W., Yang, Y., Tian, Z., Chen, Q., & Liu, J. (2024). A review of multimodal learning for text to images. Multimedia Tools and Applications. https://doi.org/10.1007/s11042-024-19117-8

Radford, A., Kim, J. W., Hallacy, C., Ramesh, A., Goh, G., Agarwal, S., Sastry, G., Askell, A., Mishkin, P., Clark, J., Krueger, G., & Sutskever, I. (2021). Learning transferable visual models from natural language supervision. arXiv. https://doi.org/10.48550/arxiv.2103.00020

Li, Q., Peng, H., Li, J., Xia, C., Yang, R., Sun, L., Yu, P. S., & He, L. (2020). A survey on text classification: From shallow to deep learning. arXiv. https://doi.org/10.48550/arxiv.2008.00364

Sidheekh, S. (2021). Learning neural networks on SVD boosted latent spaces for semantic classification. arXiv. https://doi.org/10.48550/arxiv.2101.00563

Zhao, X., Wang, L., Zhang, Y., Han, X., Deveci, M., & Parmar, M. (2024). A review of convolutional neural networks in computer vision. Artificial Intelligence Review, 57(4). https://doi.org/10.1007/s10462-024-10721-6

Santos, C. F. G. D., & Papa, J. P. (2022). Avoiding overfitting: A survey on regularization methods for convolutional neural networks. ACM Computing Surveys, 54(10s), 1–25. https://doi.org/10.1145/3510413

Gao, J., Li, P., Chen, Z., & Zhang, J. (2020). A survey on deep learning for multimodal data fusion. Neural Computation, 32(5), 829–864. https://doi.org/10.1162/neco_a_01273

Jiao, T., Guo, C., Feng, X., Chen, Y., & Song, J. (2024). A comprehensive survey on deep learning multi-modal fusion: Methods, technologies and applications. Computers, Materials & Continua, 80(1), 1–35. https://doi.org/10.32604/cmc.2024.053204

Hossain, M. S., Basak, N., Mollah, M. A., Nahiduzzaman, M., Ahsan, M., & Haider, J. (2025). Ensemble-based multiclass lung cancer classification using hybrid CNN-SVD feature extraction and selection method. PLoS ONE, 20(3), e0318219. https://doi.org/10.1371/journal.pone.0318219

MNIST Dataset. (2019). Kaggle. https://www.kaggle.com/datasets/hojjatk/mnist-dataset

PetFinder.my adoption prediction. (2019). Kaggle. https://www.kaggle.com/c/petfinder-adoption-prediction/data

Published

2025-06-30

How to Cite

Пелещак, І. (2025). RECOGNITION OF TEXTUAL AND VISUAL INFORMATION USING A HYBRID SVD AND CNN MODEL. Modern Engineering and Innovative Technologies, 2(39-02), 136–146. https://doi.org/10.30890/2567-5273.2025-39-02-049

Issue

Section

Articles