In complex underwater scenes, targets appear in varied poses, are often occluded, and sit against cluttered backgrounds, all of which place heavy demands on the feature extraction ability of convolutional networks. Mask R-CNN likewise extracts features poorly from underwater targets, which lowers its detection accuracy. This paper therefore proposes an improved underwater target recognition method based on Mask R-CNN. First, the pyramid split attention module PAS replaces the 3×3 convolution module in ResNet50; the module splits the channels in a pyramid fashion and then extracts features at different scales from the spatial information of each split channel feature map. At the same time, the more stable and efficient ECANet channel attention module replaces the SENet channel attention module inside PAS and recalibrates the multi-dimensional channel attention weights. Finally, the network structure of the feature pyramid network (FPN) is modified to strengthen information fusion between different feature layers. Comparative experiments in different scenes show that the improved network raises the accuracy of underwater target recognition, with an average detection precision of up to 91.3%. The proposed improved Mask R-CNN model adapts to complex and changeable underwater scenes and provides a theoretical basis and technical solution for underwater target recognition.
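The channel recalibration step that the abstract attributes to ECANet can be illustrated with a minimal pure-Python sketch. It follows the publicly described ECA-Net design (global average pooling, a 1D convolution across neighbouring channels, a sigmoid gate) and is not the authors' implementation. The function names `eca_kernel_size` and `eca_attention` are hypothetical, and the uniform convolution kernel is a placeholder assumption: in a trained network these weights are learned.

```python
import math


def eca_kernel_size(channels, gamma=2, b=1):
    """Adaptive 1D kernel size as described for ECA-Net:
    the nearest odd integer derived from (log2(C) + b) / gamma."""
    t = int(abs((math.log2(channels) + b) / gamma))
    return t if t % 2 == 1 else t + 1


def eca_attention(feature_map, conv_weights=None):
    """Recalibrate a feature map given as C channels of H x W nested lists.

    Returns (rescaled_feature_map, channel_weights).
    """
    channels = len(feature_map)

    # Squeeze: global average pooling gives one descriptor per channel.
    pooled = [
        sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        for ch in feature_map
    ]

    # Assumption: a uniform kernel stands in for the learned 1D conv weights.
    if conv_weights is None:
        k = eca_kernel_size(channels)
        conv_weights = [1.0 / k] * k

    # Local cross-channel interaction: zero-padded 1D convolution
    # over the channel axis, followed by a sigmoid gate.
    pad = len(conv_weights) // 2
    padded = [0.0] * pad + pooled + [0.0] * pad
    weights = []
    for c in range(channels):
        s = sum(w * padded[c + i] for i, w in enumerate(conv_weights))
        weights.append(1.0 / (1.0 + math.exp(-s)))

    # Excitation: scale every value in a channel by that channel's weight.
    rescaled = [
        [[v * weights[c] for v in row] for row in feature_map[c]]
        for c in range(channels)
    ]
    return rescaled, weights
```

Unlike an SENet-style block, this gate uses no fully connected layers and no channel reduction; each channel weight depends only on a small neighbourhood of channels, which is why the parameter count stays at the kernel size.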
2024, 46(1): 143-147 Received: 2022-11-06
DOI:10.3404/j.issn.1672-7649.2024.01.024
Classification number: TN912.34
About the author: DING Yuanming (b. 1967), male, professor; research interest: underwater signal processing