ATTENTION-INTEGRATED CONVOLUTIONAL NEURAL NETWORKS FOR ENHANCED IMAGE CLASSIFICATION: A COMPREHENSIVE THEORETICAL AND EMPIRICAL ANALYSIS

Authors

DOI:

https://doi.org/10.30890/2567-5273.2024-35-00-030

Keywords:

Attention-integrated convolutional networks, Image classification, Convolutional neural networks (CNNs), Attention mechanisms, Deep learning architecture, Feature representation, Computational efficiency, ImageNet classification, Neural network optimizat

Abstract

This paper presents a novel deep learning architecture for image classification tasks, combining convolutional neural networks (CNNs) with attention mechanisms to improve accuracy and computational efficiency. The proposed model, called Attention-Integra

Published

2024-10-30

How to Cite

Балашов, А. (2024). ATTENTION-INTEGRATED CONVOLUTIONAL NEURAL NETWORKS FOR ENHANCED IMAGE CLASSIFICATION: A COMPREHENSIVE THEORETICAL AND EMPIRICAL ANALYSIS. Modern Engineering and Innovative Technologies, 2(35-02), 18–27. https://doi.org/10.30890/2567-5273.2024-35-00-030

Issue

Section

Articles