The environmental perception capability of unmanned surface vessels (USVs) is limited by complex backgrounds, diverse target shapes, and camouflage, so conventional methods struggle to accurately detect and evaluate underwater camouflaged objects under these conditions. To address the diversity and complexity of detection scenarios, this paper proposes MFLNet (Multi-Feature Learning Network), a lightweight camouflaged object detection method for USVs based on a multi-task learning strategy, which improves a USV's ability to detect underwater camouflaged objects with the aid of an image gradient perception task. First, the image feature extraction task is decoupled into semantic feature extraction and gradient feature extraction. Then, image gradient features are injected into the high-level semantic features, and an initial prediction map is generated by a multi-scale channel attention module. Finally, the prediction of the camouflaged object is refined through layer-by-layer feature correction. Experimental results show that MFLNet achieves structure-measure ($S_{\alpha}$) scores of 0.824 and 0.851 on the CAMO-Test and NC4K-Test datasets, respectively, reaching the detection performance of state-of-the-art models while using 65% fewer parameters than lightweight models built on the same strategy. Its detection speed of 73.7 frames per second meets the real-time transmission requirements of underwater detection data, demonstrating practical application value.
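The gradient perception branch described above requires a per-pixel gradient target to supervise against. A minimal sketch of how such a target can be derived, assuming a Sobel-style gradient-magnitude map as the auxiliary supervision signal (the function name and the choice of operator are illustrative, not taken from the paper):

```python
import numpy as np

def sobel_gradient_map(gray: np.ndarray) -> np.ndarray:
    """Gradient-magnitude map of a grayscale image (H, W), scaled to [0, 1].

    A map like this can serve as the supervision target for the auxiliary
    gradient-feature branch of a multi-task camouflaged-object detector.
    """
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float64)
    ky = kx.T  # vertical Sobel kernel is the transpose of the horizontal one
    padded = np.pad(gray.astype(np.float64), 1, mode="edge")
    h, w = gray.shape
    gx = np.zeros((h, w))
    gy = np.zeros((h, w))
    # Correlate with both kernels by summing shifted windows (no SciPy needed).
    for i in range(3):
        for j in range(3):
            win = padded[i:i + h, j:j + w]
            gx += kx[i, j] * win
            gy += ky[i, j] * win
    mag = np.hypot(gx, gy)
    peak = mag.max()
    return mag / peak if peak > 0 else mag
```

On a vertical step edge, the map is zero in flat regions and peaks at the boundary, which is exactly the kind of low-level cue that helps separate a camouflaged object's contour from a visually similar background.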
2024, 46(19): 85-91. Received: 2023-11-14
DOI:10.3404/j.issn.1672-7649.2024.19.015
CLC number: TP391.41
About the author: HAN Tianbao (1999-), male, master's degree candidate; research interests: design and manufacture of ships and marine structures.