A Multi-Label Image Classification Method based on Label Correlation Learning Network
WANG Lufang;ZHANG Haiyun
【Purposes】 To address label feature confusion and the limitations of label relationships in multi-label image classification, this work proposes a multi-label image classification method based on a label correlation learning network (MLLCLN). 【Methods】 MLLCLN combines a masked attention approach with a multi-head self-attention mechanism. In the masked attention approach, the label features produced by the attention mechanism are masked with the state word vectors corresponding to the image's ground-truth labels. This gives the model more contextual information and, to some extent, avoids overlapping attention regions, which alleviates label feature confusion. In addition, a label correlation learning network is designed, consisting of multiple layers of multi-head attention and a graph neural network. The multi-head self-attention mechanism learns local label relationships from the label features, while the graph neural network uses the existing ML-GCN method as guidance so that the model also considers global label relationships, mitigating the false predictions caused by limited label relationships. 【Findings】 Experiments on the public datasets MSCOCO2014 and VOC2007 show that MLLCLN performs well, reaching classification accuracies of 84.4% and 96.0%, respectively, and offering a new approach to multi-label image classification.
multi-head self-attention;multi-label image classification;attention mechanism;adaptive weight;convolutional neural networks
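The abstract describes two complementary sources of label relationships: multi-head self-attention over per-label features (local) and a graph neural network over a label graph (global). The NumPy sketch below illustrates that general idea only. It is not the authors' implementation: the feature sizes, the random label graph, the generic GCN layer (standing in for ML-GCN's correlation-matrix construction), and the additive fusion are all assumptions made for illustration, and the masked attention step is not covered.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(X, Wq, Wk, Wv, Wo, num_heads):
    """Self-attention across labels. X has shape (C, d): one feature per label."""
    C, d = X.shape
    dh = d // num_heads
    # Project, then split the feature dimension into heads: (heads, C, dh).
    Q = (X @ Wq).reshape(C, num_heads, dh).transpose(1, 0, 2)
    K = (X @ Wk).reshape(C, num_heads, dh).transpose(1, 0, 2)
    V = (X @ Wv).reshape(C, num_heads, dh).transpose(1, 0, 2)
    scores = Q @ K.transpose(0, 2, 1) / np.sqrt(dh)   # (heads, C, C)
    attn = softmax(scores, axis=-1)                   # label-to-label weights
    out = (attn @ V).transpose(1, 0, 2).reshape(C, d) # merge heads back
    return out @ Wo, attn

def normalized_adjacency(A):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2} of a label graph."""
    A_hat = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    return A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_layer(H, A_norm, W):
    """One generic graph-convolution layer with ReLU."""
    return np.maximum(A_norm @ H @ W, 0.0)

# Hypothetical sizes: 5 labels, 8-dimensional label features, 2 heads.
rng = np.random.default_rng(0)
C, d, heads = 5, 8, 2
X = rng.normal(size=(C, d))  # stand-in for attention-derived label features
Wq, Wk, Wv, Wo = (rng.normal(scale=0.1, size=(d, d)) for _ in range(4))

# Hypothetical symmetric label graph (in practice: label co-occurrence statistics).
A = (rng.random((C, C)) > 0.6).astype(float)
A = np.maximum(A, A.T)
np.fill_diagonal(A, 0.0)
A_norm = normalized_adjacency(A)

local, attn = multi_head_self_attention(X, Wq, Wk, Wv, Wo, heads)  # local relations
glob = gcn_layer(X, A_norm, rng.normal(scale=0.1, size=(d, d)))    # global relations
fused = X + local + glob  # assumed fusion: residual sum of both branches

# One classifier vector per label, then a sigmoid for independent label scores.
w_cls = rng.normal(scale=0.1, size=(C, d))
probs = 1.0 / (1.0 + np.exp(-(fused * w_cls).sum(axis=1)))
```

Each row of `attn[h]` is a distribution over labels, so it can be read as how strongly one label attends to the others within a single image, whereas `A_norm` is fixed across images and encodes dataset-level structure.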
Sponsors: China Coal Research Institute Co., Ltd.; Academic Journal Working Committee of the China Coal Society