Journal of Computer Applications ›› 2024, Vol. 44 ›› Issue (1): 129-137. DOI: 10.11772/j.issn.1001-9081.2023010075

• Artificial Intelligence •

Self-distillation object segmentation method via scale-attention knowledge transfer

Xiaobing WANG1,2, Xiongwei ZHANG1, Tieyong CAO1, Yunfei ZHENG1,2,3, Yong WANG2,3

  1. Institute of Command and Control Engineering, Army Engineering University, Nanjing 210001, China
    2. Army Academy of Artillery and Air Defense (Nanjing Campus), Nanjing 211131, China
    3. Anhui Key Laboratory of Polarization Imaging and Detection (Army Academy of Artillery and Air Defense), Hefei 230031, China
  • Received: 2023-01-31 Revised: 2023-04-25 Accepted: 2023-05-04 Online: 2023-06-06 Published: 2024-01-10
  • Corresponding author: Xiongwei ZHANG
  • About author: WANG Xiaobing (1981—), male, born in Chuzhou, Anhui, lecturer, Ph. D.; research interests: intelligent information processing and deep learning.
    CAO Tieyong (1971—), male, born in Nanjing, Jiangsu, professor, Ph. D.; research interests: image processing and machine learning.
    ZHENG Yunfei (1983—), male, born in Chuzhou, Anhui, lecturer, Ph. D.; research interests: camouflaged object recognition and deep learning.
    WANG Yong (1983—), male, born in Hefei, Anhui, lecturer, M. S.; research interests: multimodal image processing and pattern recognition.
    Corresponding author: ZHANG Xiongwei (1965—), male, born in Jiaxing, Zhejiang, professor, Ph. D.; research interests: multimedia information processing and machine learning.
  • Supported by:
    National Natural Science Foundation of China (61801512)

Self-distillation object segmentation method via scale-attention knowledge transfer

Xiaobing WANG1,2, Xiongwei ZHANG1, Tieyong CAO1, Yunfei ZHENG1,2,3, Yong WANG2,3

  1. Institute of Command and Control Engineering, Army Engineering University, Nanjing Jiangsu 210001, China
    2. Army Academy of Artillery and Air Defense (Nanjing Campus), Nanjing Jiangsu 211131, China
    3. Anhui Key Laboratory of Polarization Imaging and Detection (Army Academy of Artillery and Air Defense), Hefei Anhui 230031, China
  • Received: 2023-01-31 Revised: 2023-04-25 Accepted: 2023-05-04 Online: 2023-06-06 Published: 2024-01-10
  • Contact: Xiongwei ZHANG
  • About author: WANG Xiaobing, born in 1981, Ph. D., lecturer. His research interests include intelligent information processing and deep learning.
    CAO Tieyong, born in 1971, Ph. D., professor. His research interests include image processing and machine learning.
    ZHENG Yunfei, born in 1983, Ph. D., lecturer. His research interests include camouflaged object recognition and deep learning.
    WANG Yong, born in 1983, M. S., lecturer. His research interests include multimodal image processing and pattern recognition.
  • Supported by:
National Natural Science Foundation of China (61801512)

Abstract:

Current object segmentation models find it difficult to balance segmentation performance and inference efficiency; to address this, a self-distillation object segmentation method based on scale-attention knowledge transfer was proposed. First, an object segmentation network that uses only backbone features was constructed as the inference network, realizing an efficient forward inference process. Second, a self-distillation learning model based on scale-attention knowledge was proposed: on the one hand, a pyramid feature module with a scale-attention mechanism was designed, in which the scale-attention mechanism adaptively captures context information at different semantic levels and extracts more discriminative self-distillation knowledge; on the other hand, a distillation loss was constructed by fusing cross entropy, KL (Kullback-Leibler) divergence and L2 distance, efficiently driving the transfer of distillation knowledge to the segmentation network and improving its generalization performance. The method was validated on five object segmentation datasets including COD (Camouflaged Object Detection), DUT-O (Dalian University of Technology-OMRON) and SOC (Salient Objects in Clutter): with the proposed inference network as the baseline network, the proposed self-distillation model improved segmentation performance by 3.01% on average in terms of the Fβ metric, 1.00% more than the Teacher-Free (TF) self-distillation model; compared with the recent residual segmentation network R2Net, the proposed network reduced the number of parameters by 2.33×10⁶, increased the inference frame rate by 2.53%, reduced the floating-point operations by 40.50%, and improved segmentation performance by 0.51%. Experimental results show that the proposed method can effectively balance performance and efficiency, and is suitable for application scenarios with limited computing and storage resources.
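As a rough illustration only, the pyramid feature module with a scale-attention mechanism described above could take a form like the following PyTorch sketch, in which multi-scale context branches are re-embedded and fused with learned per-scale attention weights; the channel width, pooling scales and residual fusion are assumptions for illustration, not the paper's exact design.

```python
# Minimal sketch of a scale-attention pyramid feature module (assumptions,
# not the authors' exact architecture): context is pooled at several scales
# and fused with attention weights predicted from the input feature.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ScaleAttentionPyramid(nn.Module):
    def __init__(self, channels=256, scales=(1, 2, 4, 8)):
        super().__init__()
        self.scales = scales
        # One 1x1 conv per pyramid branch to re-embed the pooled context.
        self.branches = nn.ModuleList(
            nn.Conv2d(channels, channels, kernel_size=1) for _ in scales
        )
        # Predict one attention weight per scale from the input feature.
        self.attn = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, len(scales), kernel_size=1),
        )

    def forward(self, x):
        n, c, h, w = x.shape
        # Per-scale context branches: pool to a coarse grid, embed, upsample back.
        feats = []
        for scale, conv in zip(self.scales, self.branches):
            pooled = F.adaptive_avg_pool2d(x, output_size=scale)
            feats.append(F.interpolate(conv(pooled), size=(h, w),
                                       mode="bilinear", align_corners=False))
        # Scale attention: softmax over the scale dimension, then weighted fusion.
        weights = torch.softmax(self.attn(x), dim=1)            # (n, S, 1, 1)
        fused = sum(weights[:, i:i + 1] * feats[i] for i in range(len(self.scales)))
        return fused + x  # residual connection keeps the backbone information
```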

Key words: self-distillation, object segmentation, knowledge transfer, scale-attention mechanism, pyramid knowledge representation

Abstract:

It is difficult for current object segmentation models to achieve a good balance between segmentation performance and inference efficiency. To address this challenge, a self-distillation object segmentation method via scale-attention knowledge transfer was proposed. Firstly, an object segmentation network using only backbone features was constructed as the inference network, to achieve an efficient forward inference process. Secondly, a self-distillation learning model via scale-attention knowledge was proposed. On the one hand, a scale-attention pyramid feature module was designed to adaptively capture context information at different semantic levels and extract more discriminative self-distillation knowledge. On the other hand, a distillation loss was constructed by fusing cross entropy, KL (Kullback-Leibler) divergence and L2 distance, which efficiently drove the transfer of distillation knowledge to the segmentation network and improved its generalization performance. The method was verified on five object segmentation datasets, including COD (Camouflaged Object Detection), DUT-O (Dalian University of Technology-OMRON) and SOC (Salient Objects in Clutter): with the proposed inference network as the baseline network, the proposed self-distillation model improved the segmentation performance by 3.01% on average in terms of the Fβ metric, which was 1.00% higher than that of the Teacher-Free (TF) self-distillation model; compared with the recent Residual learning Net (R2Net), the proposed object segmentation network reduced the number of parameters by 2.33×10⁶, improved the inference frame rate by 2.53%, decreased the floating-point operations by 40.50%, and increased the segmentation performance by 0.51%. Experimental results show that the proposed self-distillation segmentation method can effectively balance performance and efficiency, and is suitable for application scenarios with limited computing and storage resources.
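For illustration, a minimal sketch of the fused distillation loss (cross entropy + KL divergence + L2 distance) mentioned above is given below; the temperature T, the loss weights and the choice of feature pair for the L2 term are assumptions, not the exact formulation in the paper.

```python
# Minimal sketch of a distillation loss that fuses cross entropy, KL divergence
# and L2 distance, as described in the abstract. The temperature, weights and
# the feature pair used for the L2 term are illustrative assumptions.
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits,
                      student_feat, teacher_feat, target,
                      T=2.0, w_ce=1.0, w_kl=1.0, w_l2=1.0):
    # Supervised term: per-pixel cross entropy against the ground-truth mask
    # (logits: N x C x H x W, target: N x H x W class indices).
    ce = F.cross_entropy(student_logits, target)

    # Soft-label term: KL divergence between temperature-softened student and
    # teacher (auxiliary-branch) predictions, scaled by T^2 as is conventional.
    kl = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                  F.softmax(teacher_logits / T, dim=1),
                  reduction="batchmean") * (T * T)

    # Feature term: L2 distance between intermediate student and teacher features.
    l2 = F.mse_loss(student_feat, teacher_feat)

    return w_ce * ce + w_kl * kl + w_l2 * l2
```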

Key words: self-distillation, object segmentation, knowledge transfer, scale-attention mechanism, pyramid knowledge representation

CLC number: