Fine-Grained Image Classification Model Based on Improved Transformer

Zhansheng Tian; Libo Liu

doi:10.3788/LOP220453

[1] Luo J H, Wu J X. A survey on fine-grained image categorization using deep convolutional features[J]. Acta Automatica Sinica, 43, 1306-1318(2017).

[2] Wei X S, Wu J X, Cui Q. Deep learning for fine-grained image analysis: a survey[EB/OL]. https://arxiv.org/abs/1907.03069

[3] Zhang N, Donahue J, Girshick R et al. Part-based R-CNNs for fine-grained category detection[M]. Fleet D, Pajdla T, Schiele B, et al. Computer vision-ECCV 2014. Lecture notes in computer science, 834-849(2014).

[4] Lin T Y, RoyChowdhury A, Maji S. Bilinear CNN models for fine-grained visual recognition[C], 1449-1457(2015).

[5] Fu J L, Zheng H L, Mei T. Look closer to see better: recurrent attention convolutional neural network for fine-grained image recognition[C], 4476-4484(2017).

[6] Zhao B, Wu X, Feng J S et al. Diversified visual attention networks for fine-grained object classification[J]. IEEE Transactions on Multimedia, 19, 1245-1256(2017).

[7] Zhang Z G, Yu P F, Li H Y et al. Fine-grained image recognition of wild mushroom based on multiscale feature guide[J]. Laser & Optoelectronics Progress, 59, 1210016(2022).

[8] Xie S N, Girshick R, Dollár P et al. Aggregated residual transformations for deep neural networks[C], 5987-5995(2017).

[9] Wang B Z, Xiao Z Y. Channel attention multi-branch network for fine-grained image recognition[J]. Laser & Optoelectronics Progress, 58, 2210008(2021).

[10] Wang J N, Gao Y, Shi J et al. Scene classification of optical high-resolution remote sensing images using vision transformer and graph convolutional network[J]. Acta Photonica Sinica, 50, 1128002(2021).

[11] Dosovitskiy A, Beyer L, Kolesnikov A et al. An image is worth 16×16 words: transformers for image recognition at scale[EB/OL]. https://arxiv.org/abs/2010.11929

[12] Guo M H, Liu Z N, Mu T J et al. Beyond self-attention: external attention using two linear layers for visual tasks[EB/OL]. https://arxiv.org/abs/2105.02358

[13] He J, Chen J N, Liu S et al. TransFG: a transformer architecture for fine-grained recognition[EB/OL]. https://arxiv.org/abs/2103.07976

[14] Touvron H, Cord M, Douze M et al. Training data-efficient image transformers & distillation through attention[EB/OL]. https://arxiv.org/abs/2012.12877

[15] He K M, Zhang X Y, Ren S Q et al. Deep residual learning for image recognition[C], 770-778(2016).

[16] Cao Y, Xu J R, Lin S et al. GCNet: non-local networks meet squeeze-excitation networks and beyond[C], 1971-1980(2019).

[17] Hinton G, Vinyals O, Dean J. Distilling the knowledge in a neural network[EB/OL]. https://arxiv.org/abs/1503.02531

[18] Wah C, Branson S, Welinder P et al. The Caltech-UCSD Birds-200-2011 dataset[R](2011).

[19] Krause J, Stark M, Deng J et al. 3D object representations for fine-grained categorization[C], 554-561(2013).

[20] Khosla A, Jayadevaprakash N, Yao B et al. Novel dataset for fine-grained image categorization: Stanford dogs[EB/OL]. https://people.csail.mit.edu/khosla/papers/fgvc2011.pdf

[21] Cubuk E D, Zoph B, Mane D et al. AutoAugment: learning augmentation policies from data[EB/OL]. https://arxiv.org/abs/1805.09501

[22] Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition[EB/OL]. https://arxiv.org/abs/1409.1556

[23] Huang G, Liu Z, van der Maaten L et al. Densely connected convolutional networks[C], 2261-2269(2017).

[24] Sun G L, Cholakkal H, Khan S et al. Fine-grained recognition: accounting for subtle differences between similar classes[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 34, 12047-12054(2020).

[25] Luo W, Zhang H M, Li J et al. Learning semantically enhanced feature for fine-grained image classification[J]. IEEE Signal Processing Letters, 27, 1545-1549(2020).

[26] Hu T, Qi H. See better before looking closer: weakly supervised data augmentation network for fine-grained visual classification[EB/OL]. https://arxiv.org/abs/1901.09891

[27] Ji R Y, Wen L Y, Zhang L B et al. Attention convolutional binary neural tree for fine-grained visual categorization[C], 10465-10474(2020).

[28] Sun M, Yuan Y C, Zhou F et al. Multi-attention multi-class constraint for fine-grained image recognition[M]. Ferrari V, Hebert M, Sminchisescu C, et al. Computer vision-ECCV 2018. Lecture notes in computer science, 11220, 805-821(2018).

[29] Chang D, Ding Y, Xie J et al. The devil is in the channels: mutual-channel loss for fine-grained image classification[J]. IEEE Transactions on Image Processing, 29, 4683-4695(2020).
