Cui, Yang; Zhang, Juan (2025) MFEAM: Multi-View Feature Enhanced Attention Model for Image Captioning. Applied Sciences, 15 (15). doi:10.3390/app15158368

Reference Type	Journal (article/letter/editorial)
Title	MFEAM: Multi-View Feature Enhanced Attention Model for Image Captioning
Journal	Applied Sciences
Authors	Cui, Yang		Author
Authors	Zhang, Juan		Author
Year	2025 (July 28)	Volume	15
Issue	15
Publisher	MDPI AG
DOI	doi:10.3390/app15158368
Mindat Ref. ID	18781277	Long-form Identifier	mindat:1:5:18781277:3
GUID	0
Full Reference	Cui, Yang; Zhang, Juan (2025) MFEAM: Multi-View Feature Enhanced Attention Model for Image Captioning. Applied Sciences, 15 (15). doi:10.3390/app15158368
Plain Text	Cui, Yang; Zhang, Juan (2025) MFEAM: Multi-View Feature Enhanced Attention Model for Image Captioning. Applied Sciences, 15 (15). doi:10.3390/app15158368
In	(2025, July) Applied Sciences Vol. 15 (15). MDPI AG

References Listed

These are the references the publisher has listed as being connected to the article. Please check the article itself for the full list of references which may differ. Not all references are currently linkable within the Digital Library.

	Not Yet Imported: 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), proceedings-article, doi:10.1109/CVPR.2015.7298935
	Not Yet Imported: 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), proceedings-article, doi:10.1109/CVPR.2015.7298932
	Not Yet Imported: proceedings-article, doi:10.1109/CVPR.2018.00583
	Not Yet Imported: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, proceedings-article, doi:10.1109/CVPR.2018.00636
	Not Yet Imported: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), proceedings-article, doi:10.1109/CVPR.2017.131
	Not Yet Imported: journal-article, doi:10.1016/j.neucom.2020.03.087
	Not Yet Imported: report, doi:10.21236/ADA623249
	Not Yet Imported: proceedings-article, doi:10.1145/2964284.2964299
	Not Yet Imported: proceedings-article, doi:10.1109/CVPR.2017.345
	Not Yet Imported: journal-article, doi:10.1016/j.neucom.2018.08.069
	Radford, A., Kim, J.W., Hallacy, C., Ramesh, A., Goh, G., Agarwal, S., Sastry, G., Askell, A., Mishkin, P., and Clark, J. (2021, July 18–24). Learning transferable visual models from natural language supervision. Proceedings of the International Conference on Machine Learning, PMLR, Virtual.
	Not Yet Imported: journal-article, doi:10.1016/j.eswa.2022.117174
	Not Yet Imported: 2022 26th International Conference on Pattern Recognition (ICPR), proceedings-article, doi:10.1109/ICPR56361.2022.9955644
	Zhang (2024) Int. J. Comput. Appl. Mobilenet V3-transformer, a lightweight model for image caption 46, 1
	Chen, J., Ge, C., Xie, E., Wu, Y., Yao, L., Ren, X., Wang, Z., Luo, P., Lu, H., and Li, Z. PIXART-sigma: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation. Proceedings of the European Conference on Computer Vision.
	Moratelli, N., Caffagni, D., Cornia, M., Baraldi, L., and Cucchiara, R. (2024). Revisiting Image Captioning Training Paradigm via Direct CLIP-based Optimization. arXiv.
	Wang, F., Mei, J., and Yuille, A. Sclip: Rethinking self-attention for dense vision-language inference. Proceedings of the European Conference on Computer Vision.
	Moratelli, N., Cornia, M., Baraldi, L., and Cucchiara, R. Fluent and Accurate Image Captioning with a Self-Trained Reward Model. Proceedings of the International Conference on Pattern Recognition.
	Tarvainen, A., and Valpola, H. (2017, December 4–9). Mean teachers are better role models: Weight-averaged consistency targets improve semi-supervised deep learning results. Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, CA, USA.
	Gu, Y., Dong, L., Wei, F., and Huang, M. (2024, May 7–11). MiniLLM: Knowledge distillation of large language models. Proceedings of the Twelfth International Conference on Learning Representations, Vienna, Austria.
	Kang (2024) Adv. Neural Inf. Process. Syst. Knowledge-augmented reasoning distillation for small language models in knowledge-intensive tasks 36, 48573
	Li, Z., Li, X., Fu, X., Zhang, X., Wang, W., Chen, S., and Yang, J. (2024, June 16–22). Promptkd: Unsupervised prompt distillation for vision-language models. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.
	Nguyen (2024) Adv. Neural Inf. Process. Syst. Improving multimodal datasets with image captioning 36, 22047
	Mahmoud, A., Elhoushi, M., Abbas, A., Yang, Y., Ardalani, N., Leather, H., and Morcos, A.S. (2024, June 16–22). Sieve: Multimodal dataset pruning using image captioning models. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.
	Awadalla, A., Xue, L., Shu, M., Yan, A., Wang, J., Purushwalkam, S., Shen, S., Lee, H., Lo, O., and Park, J.S. (2024). BLIP3-KALE: Knowledge Augmented Large-Scale Dense Captions. arXiv.
	Yu, Q., Sun, Q., Zhang, X., Cui, Y., Zhang, F., Cao, Y., Wang, X., and Liu, J. (2024, June 16–20). Capsfusion: Rethinking image-text data at scale. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.
	Chen, L., Li, J., Dong, X., Zhang, P., He, C., Wang, J., Zhao, F., and Lin, D. Sharegpt4v: Improving large multi-modal models with better captions. Proceedings of the European Conference on Computer Vision.
	Not Yet Imported: proceedings-article, doi:10.1109/CVPR52688.2022.01949
	Not Yet Imported: journal-article, doi:10.1007/s00530-022-01036-z
	Yang (2024) IEEE Trans. Geosci. Remote Sens. Bootstrapping interactive image-text alignment for remote sensing image captioning 62, 1
	Not Yet Imported: journal-article, doi:10.1007/s11042-024-18150-x
	Not Yet Imported: book-chapter, doi:10.1007/978-3-030-01264-9_42 (Book ID 9783030012632)
	Vaswani (2017) Adv. Neural Inf. Process. Syst. Attention is all you need 17, 6000
	Hinton, G. (2015). Distilling the Knowledge in a Neural Network. arXiv.
	Not Yet Imported: proceedings-article, doi:10.1109/ICCVW54120.2021.00350
	Sameni, S., Kafle, K., Tan, H., and Jenni, S. (2024, June 16–22). Building Vision-Language Models on Solid Foundations with Masked Distillation. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.
	Ren (2025) Appl. Intell. EDIR: An expert method for describing image regions based on knowledge distillation and triple fusion 55, 62
	Not Yet Imported: proceedings-article, doi:10.1109/CVPR42600.2020.00483
	Bajpai, D.J., and Hanawal, M.K. (2024). CAPEEN: Image Captioning with Early Exits and Knowledge Distillation. arXiv.
	Cohen, G. E. (1997) ALIGN: a program to superimpose protein coordinates, accounting for insertions and deletions. Journal of Applied Crystallography, 30 (6). 1160-1161 doi:10.1107/s0021889897006729
	Xiao, B., Wu, H., Xu, W., Dai, X., Hu, H., Lu, Y., Zeng, M., Liu, C., and Yuan, L. (2024, June 16–22). Florence-2: Advancing a unified representation for a variety of vision tasks. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.
	Not Yet Imported: 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), proceedings-article, doi:10.1109/CVPR.2015.7299087
	Not Yet Imported: book-chapter, doi:10.1007/978-3-319-10602-1_48 (Book ID 9783319106014)
	Not Yet Imported: Proceedings of the 40th Annual Meeting on Association for Computational Linguistics - ACL '02, proceedings-article, doi:10.3115/1073083.1073135
	Banerjee, S., and Lavie, A. (2005, June 29). METEOR: An automatic metric for MT evaluation with improved correlation with human judgments. Proceedings of the ACL Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and/or Summarization, Ann Arbor, MI, USA.
	Lin, C.Y. (2004, July 22). Rouge: A package for automatic evaluation of summaries. Proceedings of the Text Summarization Branches Out, Barcelona, Spain.
	Not Yet Imported: book-chapter, doi:10.1007/978-3-319-46454-1_24 (Book ID 9783319464534)
	Not Yet Imported: proceedings-article, doi:10.18653/v1/P16-1162
	Yao, T., Pan, Y., Li, Y., and Mei, T. (2019, October 27–November 2). Hierarchy parsing for image captioning. Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea.
	Wu, M., Zhang, X., Sun, X., Zhou, Y., Chen, C., Gu, J., Sun, X., and Ji, R. (2019, October 27–November 2). Difnet: Boosting visual information flow for image captioning. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seoul, Republic of Korea.
	Huang, L., Wang, W., Chen, J., and Wei, X.Y. (2019, October 27–November 2). Attention on attention for image captioning. Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea.
	Not Yet Imported: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), proceedings-article, doi:10.1109/CVPR42600.2020.01098
	Not Yet Imported: journal-article, doi:10.1609/aaai.v35i3.16328
	Not Yet Imported: 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), proceedings-article, doi:10.1109/CVPR46437.2021.01521
	Not Yet Imported: journal-article, doi:10.1007/s00530-023-01230-7
	Not Yet Imported: Applied Intelligence, journal-article, doi:10.1007/s10489-022-03624-y

See Also

	Li, Jingwen; Zhao, Mengke; Wei, Xiaoru; Shao, Yusen; Wang, Qingyang; Yang, Zhenxin (2025) MDNet: A Differential-Perception-Enhanced Multi-Scale Attention Network for Remote Sensing Image Change Detection. Applied Sciences, 15 (16). doi:10.3390/app15168794
	Li, Yunpeng; Tao, Chengjin; Liu, Meng; Zhang, Xiangrong; Wang, Guanchun; Zhang, Tianyang; Zhao, Dong; Wang, Dabao (2025) Feature refinement and rethinking attention for remote sensing image captioning. Scientific Reports, 15 (1). doi:10.1038/s41598-025-93125-y
	Jama, Bashir Sheikh Abdullahi; Hacibeyoglu, Mehmet (2025) A GAN-Based Framework with Dynamic Adaptive Attention for Multi-Class Image Segmentation in Autonomous Driving. Applied Sciences, 15 (15). doi:10.3390/app15158162
	Chang, Xueli; Wang, Xiaodong; Huang, Xiaoyu; Yan, Meng; Cheng, Luxiao (2025) Multi-Scale Differentiated Network with Spatial–Spectral Co-Operative Attention for Hyperspectral Image Denoising. Applied Sciences, 15 (15). doi:10.3390/app15158648
	Zhang, Sen; Du, Weilin; Liu, Yuan; Zhou, Ni; Li, Zheng (2025) Feature Enhancement Network for Infrared Small Target Detection in Complex Backgrounds Based on Multi-Scale Attention Mechanism. Applied Sciences, 15 (9). doi:10.3390/app15094966
	Yang, Qing; Wei, Ying; Liu, Fei; Wu, Zhuang (2025) An Accurate and Efficient Diabetic Retinopathy Diagnosis Method via Depthwise Separable Convolution and Multi-View Attention Mechanism. Applied Sciences, 15 (17). doi:10.3390/app15179298
	Li, Yifan; Wu, Gengshen (2025) Multi-Scale Feature Fusion and Global Context Modeling for Fine-Grained Remote Sensing Image Segmentation. Applied Sciences, 15 (10). doi:10.3390/app15105542
	Shi, Jihui; Huang, Jijiang; Guan, Lei; Chen, Weining (2025) Multi-Feature Fusion Diffusion Post-Processing for Low-Light Image Denoising. Applied Sciences, 15 (16). doi:10.3390/app15168850
	Dai, Wenbin; Ma, Yuxin; Fan, Yan; Ma, Jun (2025) A Multi-Scale Feature Extraction Algorithm for Chinese Herbal Medicine Image Classification. Applied Sciences, 15 (8). doi:10.3390/app15084271
	Xie, Tian; Ding, Weiping; Zhang, Jinbao; Wan, Xusen; Wang, Jiehua (2023) Bi-LS-AttM: A Bidirectional LSTM and Attention Mechanism Model for Improving Image Captioning. Applied Sciences, 13 (13), 7916. doi:10.3390/app13137916
	Zhao, Yang; Hu, Liangchen; Xu, Sen (2025) Multi-Scale Context Fusion Method with Spatial Attention for Accurate Crop Disease Detection. Applied Sciences, 15 (17). doi:10.3390/app15179341
