• Paper
Sponsors: China Coal Research Institute Co., Ltd.; Academic Journal Working Committee of the China Coal Society
Rich semantic extractor network for real-time semantic segmentation
  • Title

    Rich semantic extractor network for real-time semantic segmentation

  • Authors

    ZHAO Shan; TIAN Kaiwen; SUN Junding

  • Organization

    School of Software, Henan Polytechnic University
    School of Computer Science and Technology, Henan Polytechnic University
  • Abstract

    Objectives: Real-time semantic segmentation networks are kept shallow to meet inference-speed constraints, so the semantic feature information they extract is insufficient. The shallow depth also restricts the capability of the feature extraction network, reducing its robustness and adaptability.

    Methods: To address these problems, a rich semantic extractor network (RSENet) for real-time semantic segmentation was proposed. First, to address insufficient semantic feature extraction, a rich semantic extractor (RSE) was introduced, consisting of a multi-scale global semantic extraction module (MGSEM) and a semantic fusion module (SFM). Second, MGSEM extracted rich multi-scale global semantics and expanded the effective receptive field of the network, while SFM efficiently fused multi-scale local semantics with multi-scale global semantics, giving the network more comprehensive and richer semantic information. Finally, based on the characteristics of the detail branch and the semantic branch, a space reconstruction aggregation module (SRAM) was designed to model the context information of the detail features and enhance the feature representation, so that the two branches could be aggregated efficiently.

    Results: Comprehensive experiments were conducted on the Cityscapes and ADE20K datasets. The proposed RSENet achieved mIoU of 75.6% and 35.7% at inference speeds of 76 frames/s and 67 frames/s, respectively.

    Conclusions: The experimental results indicated that the proposed network could thoroughly mine and accurately capture semantic information in images of complex scenes. It also struck a strong balance between accuracy and speed, achieving high-precision semantic segmentation at fast inference speeds. This efficient segmentation capability made the network highly practical and workable in real-world application scenarios.

  • Keywords

    semantic segmentation; multi-scale feature; vision Transformer; feature fusion

  • Foundation
    National Natural Science Foundation of China (62276092)
  • DOI
  • Citation
    ZHAO S, TIAN K W, SUN J D. Rich semantic extractor network for real-time semantic segmentation[J]. Journal of Henan Polytechnic University (Natural Science), 2024, 43(6): 146-155.
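
The abstract reports results as mIoU at a given number of frames per second. As a rough illustration of what these two figures measure, the following is a minimal sketch and not the authors' evaluation code. The function names `mean_iou` and `latency_ms`, the flattened label lists, and the `ignore_index=255` convention are assumptions made for the example.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean intersection-over-union across classes.

    pred and target are flat sequences of integer class labels of equal
    length. Predictions are assumed to lie in [0, num_classes).
    Pixels whose target equals ignore_index are skipped (assumed convention).
    Classes that occur in neither pred nor target are left out of the mean.
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue
        if p == t:
            # Correct pixel: counts once toward intersection and union.
            intersection[t] += 1
            union[t] += 1
        else:
            # Wrong pixel: enlarges the union of both classes involved.
            union[p] += 1
            union[t] += 1
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else float("nan")


def latency_ms(frames_per_second):
    """Average time per frame in milliseconds for a given throughput."""
    return 1000.0 / frames_per_second


if __name__ == "__main__":
    # Toy labels for three classes; the last pixel is ignored.
    pred = [0, 0, 1, 1, 2, 2]
    target = [0, 1, 1, 1, 2, 255]
    print(f"mIoU = {mean_iou(pred, target, num_classes=3):.4f}")
    # Throughputs quoted in the abstract, converted to per-frame latency.
    for fps in (76, 67):
        print(f"{fps} frames/s = {latency_ms(fps):.1f} ms per frame")
```

Under this reading, 76 frames/s and 67 frames/s correspond to roughly 13.2 ms and 14.9 ms per image. Benchmark evaluations usually accumulate intersection and union counts over the whole validation set before averaging, which this function also does if the labels of all images are concatenated.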