Research on coal-gangue identification technology driven by multi-source fusion of image features and vibration spectrum
LI Libao;YUAN Yong;QIN Zhenghan;LI Bo;YAN Zhengtian;LI Yong
Existing methods that fuse image and vibration signals for coal-gangue identification suffer from difficult feature fusion, and their real-time performance and model complexity do not meet the requirements of practical application. To address these problems, a multi-layer long short-term memory (ML-LSTM) model based on multi-head attention (MA), termed MA-ML-LSTM, was designed. Vibration signals were processed with the variational mode decomposition (VMD) algorithm optimized by particle swarm optimization (PSO). Energy, energy moment, kurtosis, waveform factor, and matrix singular values were taken as feature quantities, and a one-dimensional convolutional network was used to obtain the vibration information. For the image channel, the final fully connected layer of the multi-class network ResNet-18 was removed so that the network could extract deep features from coal-gangue images. The MA mechanism and the ML-LSTM network were then used to fuse the image and vibration features across the two channels, strengthening the expression of the important feature information in each channel. Experimental results show that the MA-ML-LSTM model achieved an average recognition accuracy of 98.72%, which is 4.60%, 7.96%, 5.37%, and 6.11% higher than that of the conventional single models ResNet, MobilenetV3, 1D-CNN, and LSTM, respectively, and 4.18%, 4.45%, and 3.46% higher than that of EMD-RF, IMF-SVM, and CSPNet-YOLOv7, respectively. These results verify the effectiveness of coal-gangue identification driven by multi-source fusion of image features and vibration spectrum.
coal-gangue identification;multi-source information fusion;vibration signals;image recognition;multi-head attention mechanism;multi-layer long short-term memory model
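The vibration features named in the abstract (energy, energy moment, kurtosis, waveform factor) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `mode_features` and `feature_vector`, the sampling interval `dt`, and the exact formulas (time-weighted squared samples for the energy moment, non-excess kurtosis, RMS divided by mean absolute value for the waveform factor) are assumptions based on common signal-processing definitions. The input is assumed to be a list of mode components already produced by a decomposition such as VMD; the decomposition itself, the PSO tuning, and the matrix singular values are omitted because they need a numerical library.

```python
import math


def mode_features(mode, dt=1.0):
    """Return four time-domain features of one mode component.

    `mode` is a sequence of samples; `dt` is an assumed sampling interval.
    """
    n = len(mode)
    if n == 0:
        raise ValueError("mode must contain at least one sample")

    # Energy: sum of squared samples.
    energy = sum(x * x for x in mode)

    # Energy moment (assumed definition): each squared sample is weighted
    # by its time stamp k * dt, with k counted from 1.
    energy_moment = sum((k * dt) * x * x for k, x in enumerate(mode, start=1))

    # Kurtosis (assumed non-excess form): fourth central moment / variance^2.
    mean = sum(mode) / n
    variance = sum((x - mean) ** 2 for x in mode) / n
    fourth_moment = sum((x - mean) ** 4 for x in mode) / n
    kurtosis = fourth_moment / variance ** 2 if variance > 0 else float("nan")

    # Waveform factor (assumed definition): RMS / mean absolute value.
    rms = math.sqrt(energy / n)
    mean_abs = sum(abs(x) for x in mode) / n
    waveform_factor = rms / mean_abs if mean_abs > 0 else float("nan")

    return {
        "energy": energy,
        "energy_moment": energy_moment,
        "kurtosis": kurtosis,
        "waveform_factor": waveform_factor,
    }


def feature_vector(modes, dt=1.0):
    """Concatenate the four features of every mode into one flat list."""
    vector = []
    for mode in modes:
        f = mode_features(mode, dt)
        vector.extend(
            [f["energy"], f["energy_moment"], f["kurtosis"], f["waveform_factor"]]
        )
    return vector
```

With K mode components, `feature_vector` yields a vector of length 4K, which under these assumptions would be one possible input to a one-dimensional convolutional network on the vibration channel.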
Sponsors: China Coal Research Institute Co., Ltd.; Academic Journal Working Committee of the China Coal Society