Underwater target recognition technology based on deep learning

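Supervised by: China State Shipbuilding Corporation Limited

Sponsored by: China Ship Research and Development Academy; the 714 Research Institute of China State Shipbuilding Corporation Limited

Edited and published by: Editorial Office of Ship Science and Technology (《舰船科学技术》)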

ISSN: 1672-7649

CN: 11-1885/U

Authors: DING Yuan-ming^1,2, XU Li-hua^1,2, HOU Meng-ke^1,2

Affiliations: 1. College of Information Engineering, Dalian University, Dalian 116622, Liaoning, China;
2. Key Laboratory of Communication and Network, Dalian University, Dalian 116622, Liaoning, China

Keywords: underwater target recognition; Mask R-CNN; deep learning

Abstract: In complex underwater scenes, targets vary in pose, are often occluded, and appear against cluttered backgrounds, all of which severely challenge the feature extraction ability of convolutional networks. The Mask R-CNN algorithm also extracts features poorly from underwater targets, which lowers its detection accuracy. This paper therefore proposes an improved underwater target recognition method based on Mask R-CNN. First, the pyramid split channel attention module PAS replaces the 3×3 convolution module in ResNet50; the module splits the channels in a pyramid fashion and extracts spatial information at different scales from each resulting channel feature map. Second, the more stable and efficient ECANet channel attention module replaces the SENet channel attention module inside PAS and recalibrates the multi-dimensional channel attention weights. Finally, the network structure of the feature pyramid network (FPN) is modified to strengthen information fusion between different feature layers. Comparative experiments in different scenes show that the improved network raises underwater target recognition accuracy, with an average detection precision of up to 91.3%. The improved Mask R-CNN model adapts to complex and changing underwater scenes and provides a theoretical basis and a technical approach for underwater target recognition.
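The ECANet channel attention step mentioned in the abstract can be sketched as follows. This is a minimal pure-Python illustration of ECA-style channel attention (global average pooling, a 1D convolution across neighbouring channels, then a sigmoid gate that rescales each channel), not the authors' implementation. The function names, the nested-list layout of the feature map, and the hand-picked kernel weights are assumptions made for illustration; the kernel-size rule uses the defaults gamma=2 and b=1 from the ECA-Net paper, which the abstract does not confirm for this work.

```python
import math


def eca_kernel_size(channels, gamma=2, b=1):
    """Adaptive 1D kernel size: nearest odd number to (log2(C) + b) / gamma.

    gamma and b are assumed defaults, not values stated in the abstract.
    """
    t = int(abs((math.log2(channels) + b) / gamma))
    return t if t % 2 else t + 1


def global_avg_pool(fmap):
    """Reduce a C x H x W nested list to one mean value per channel."""
    return [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in fmap
    ]


def conv1d_same(values, kernel):
    """1D convolution across channels with zero padding (odd kernel length)."""
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(values) + [0.0] * pad
    return [
        sum(w * padded[i + j] for j, w in enumerate(kernel))
        for i in range(len(values))
    ]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def eca_attention(fmap, kernel):
    """Return the channel-rescaled feature map and the attention weights.

    In a trained network the kernel weights are learned; here they are
    passed in by hand so the example stays self-contained.
    """
    descriptor = global_avg_pool(fmap)
    weights = [sigmoid(v) for v in conv1d_same(descriptor, kernel)]
    scaled = [
        [[w * x for x in row] for row in channel]
        for w, channel in zip(weights, fmap)
    ]
    return scaled, weights
```

Unlike SENet, which passes the pooled descriptor through two fully connected layers with a channel reduction, this gate only mixes each channel with its nearest neighbours, so the parameter count is the kernel length rather than a function of the channel count.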

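2024, 46(1): 143-147. Received: 2022-11-06

DOI: 10.3404/j.issn.1672-7649.2024.01.024

Classification code: TN912.34

About the author: DING Yuan-ming (born 1967), male, professor; research interest: underwater signal processing.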
