Details

HCAM-CL: A Novel Method Integrating a Hierarchical Cross-Attention Mechanism with CNN-LSTM for Hierarchical Image Classification (indexed in SCI-EXPANDED)

Document type: Journal article

English title: HCAM-CL: A Novel Method Integrating a Hierarchical Cross-Attention Mechanism with CNN-LSTM for Hierarchical Image Classification

Authors: Su, Jing[1]; Liang, Jianmin[1]; Zhu, Jiayi[1]; Li, Yongjiang[1]

Affiliation: [1] Guangdong Ocean Univ, Sch Math & Comp, Zhanjiang 524088, Peoples R China

Year: 2024

Volume: 16

Issue: 9

Journal: SYMMETRY-BASEL

Indexed in: SCI-EXPANDED (accession no. WOS:001323331300001), Scopus (accession no. 2-s2.0-85205121249), WOS

Funding: This research was supported by a special grant from the program for scientific research start-up funds of Guangdong Ocean University under Grant No. 060302102303, the Industry-University-Research Innovation Fund Project of the Science and Technology Development Center of the Ministry of Education under Grant No. 2020QT13, the Ministry of Education's Industry-University-Research Collaborative Education Project under Grant No. 239920011, and the National College Students Innovation and Entrepreneurship Training Program under Grant No. 010403102309.

Language: English

Keywords: hierarchical image classification; cross-attention mechanism; CNN-LSTM

Abstract: Deep learning networks have yielded promising insights in the field of image classification. However, the hierarchical image classification (HIC) task, which involves assigning multiple, hierarchically organized labels to each image, presents a notable challenge. In response to this complexity, we developed a novel framework (HCAM-CL), which integrates a hierarchical cross-attention mechanism with a CNN-LSTM architecture for the HIC task. The HCAM-CL model effectively identifies the relevance between images and their corresponding labels while also being attuned to learning the hierarchical inter-dependencies among labels. Our versatile model is designed to manage both fixed-length and variable-length classification pathways within the hierarchy. In the HCAM-CL model, the CNN module is responsible for the essential task of extracting image features. The hierarchical cross-attention mechanism vertically aligns these features with hierarchical levels, uniformly weighing the importance of different spatial regions. Ultimately, the LSTM module is strategically utilized to generate predictive outcomes by treating HIC as a sequence generation challenge. Extensive experimental evaluations on CIFAR-10, CIFAR-100, and design patent image datasets demonstrate that our HCAM-CL framework consistently outperforms other state-of-the-art methods in hierarchical image classification.
