ATTENTION-INTEGRATED CONVOLUTIONAL NEURAL NETWORKS FOR ENHANCED IMAGE CLASSIFICATION: A COMPREHENSIVE THEORETICAL AND EMPIRICAL ANALYSIS
DOI:
https://doi.org/10.30890/2567-5273.2024-35-00-030
Keywords:
attention-integrated convolutional networks, Image classification, Convolutional neural networks (CNNs), Attention mechanisms, Deep learning architecture, Feature representation, Computational efficiency, ImageNet classification, Neural network optimizat
Abstract
This paper presents a novel deep learning architecture for image classification tasks, combining convolutional neural networks (CNNs) with attention mechanisms to improve accuracy and computational efficiency. The proposed model, called Attention-Integra
License
Copyright (c) 2024 Authors

This work is licensed under a Creative Commons Attribution 4.0 International License.