Underwater environments are complex and highly variable, and underwater imagery suffers from severe visual degradation; existing underwater object detection algorithms therefore struggle to guarantee high-accuracy, real-time detection and lack robustness in complex scenes. To address these problems, this paper improves on YOLOv5 and proposes an end-to-end multi-scale underwater object detection network (UW-Net) for high-accuracy, real-time, and robust detection of underwater objects in complex underwater environments. In the feature extraction stage, the network combines a stable low-level feature extraction module with CSP-Net to build an accurate yet lightweight backbone, extracting richer, higher-dimensional feature information while preserving real-time performance. In the feature fusion and detection stage, an adaptive feature fusion mechanism and an attention enhancement method improve the multi-scale detection capability and robustness of the algorithm with almost no loss in detection speed, and optimal anchor boxes are calibrated in a self-supervised manner by K-means clustering, providing accurate priors on target regions. Experimental results show that, on an NVIDIA GeForce GTX 1080 Ti GPU, the method achieves a mean detection accuracy of 95.06% and a detection speed of 139 FPS on the underwater object dataset, exceeding the YOLOv5s network by 2.87% and 14 FPS, respectively, thus realizing high-accuracy, real-time, and robust detection of underwater objects in realistic complex underwater environments.
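As a concrete illustration of the anchor calibration step mentioned above, the following is a minimal sketch, not the authors' implementation, of K-means anchor clustering over ground-truth box sizes using 1 − IoU as the distance, in the style popularised by the YOLO family. The helper names (wh_iou, kmeans_anchors), the NumPy realization, and the placeholder box data are all assumptions introduced for illustration.

```python
import numpy as np

def wh_iou(wh, anchors):
    # IoU between box sizes and anchor sizes, assuming aligned top-left corners.
    # wh: (N, 2) widths/heights of ground-truth boxes; anchors: (K, 2).
    inter = np.minimum(wh[:, None, :], anchors[None, :, :]).prod(axis=2)   # (N, K)
    union = wh.prod(axis=1)[:, None] + anchors.prod(axis=1)[None, :] - inter
    return inter / union

def kmeans_anchors(wh, k=9, iters=100, seed=0):
    # K-means with 1 - IoU as the distance; the cluster centres become the anchors.
    rng = np.random.default_rng(seed)
    anchors = wh[rng.choice(len(wh), size=k, replace=False)].astype(float)
    for _ in range(iters):
        assign = np.argmax(wh_iou(wh, anchors), axis=1)    # nearest anchor per box
        for j in range(k):
            members = wh[assign == j]
            if len(members):
                anchors[j] = members.mean(axis=0)          # recompute cluster centre
    return anchors[np.argsort(anchors.prod(axis=1))]       # sort by area, small to large

# Placeholder data: in practice wh comes from the training labels, scaled to the input size.
wh = np.abs(np.random.randn(2000, 2)) * 60.0 + 20.0
print(kmeans_anchors(wh, k=9).round(1))
```

Using 1 − IoU rather than Euclidean distance keeps the clustering scale-aware, so small and large boxes contribute comparably; YOLOv5 additionally refines the clustered anchors with a genetic search, which is omitted in this sketch.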
2023, 45(22): 148–154    Received: 2022-10-25
DOI:10.3404/j.issn.1672-7649.2023.22.028
CLC number: TP391.41
Funding: Hainan Provincial Science and Technology Special Project (ZDKJ2019002)
Author biography: GE Xiyun (葛锡云, b. 1986), male, M.S., senior engineer; research interest: research and design of underwater intelligent perception systems.