To address the difficulty of acquiring underwater acoustic images and the scarcity of high-quality data, a side-scan sonar image dataset augmentation method based on parameter-efficient fine-tuning of Stable Diffusion is proposed. The method aims to improve the quality and diversity of the dataset and thereby the performance of deep-learning-based ship side-scan sonar target detection systems. First, the weights of the fully connected layers of the pre-trained model are frozen; next, trainable rank decomposition matrices are injected; finally, prompts are embedded to generate image samples. Experimental results show that, compared with the current mainstream CycleGAN-based methods, the proposed method generates side-scan sonar images of higher quality, greater diversity, and better stability. In addition, after the dataset is augmented, the performance of several mainstream object detection algorithms improves; the mAP@0.5 of YOLOv8n increases by 22.9%, demonstrating the effectiveness of the method.
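The freeze-then-inject step described in the abstract matches the general idea of low-rank adaptation (LoRA), and can be illustrated on a single linear layer. The following is a minimal sketch in plain Python, not the authors' implementation and not tied to Stable Diffusion: the class `LoRALinear`, the helper `matvec`, the scaling `alpha / rank`, and the initialisation values are all illustrative assumptions. The pre-trained weight `W` is never modified, and only the two small matrices `A` and `B` would be trained.

```python
import random


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


class LoRALinear:
    """Frozen weight W plus a trainable low-rank update (alpha / rank) * B @ A.

    Hypothetical illustration class; names and defaults are assumptions.
    """

    def __init__(self, weight, rank, alpha=1.0, seed=0):
        rng = random.Random(seed)
        d_out, d_in = len(weight), len(weight[0])
        self.weight = weight  # pre-trained weight, kept frozen
        # A gets small random values and B starts at zero,
        # so the injected update is exactly zero before training.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]
        self.scale = alpha / rank

    def forward(self, x):
        base = matvec(self.weight, x)                # frozen path
        update = matvec(self.B, matvec(self.A, x))   # low-rank path B(Ax)
        return [b + self.scale * u for b, u in zip(base, update)]

    def trainable_params(self):
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])

    def frozen_params(self):
        return len(self.weight) * len(self.weight[0])


# Toy 6x8 "pre-trained" weight with a rank-2 adapter.
W = [[float(i + j) for j in range(8)] for i in range(6)]
layer = LoRALinear(W, rank=2)
x = [1.0] * 8
print(layer.forward(x) == matvec(W, x))  # update is zero at initialisation
print(layer.trainable_params(), layer.frozen_params())  # 28 trainable vs 48 frozen
```

In this toy case the adapter is not much smaller than the frozen weight, but the saving grows with layer size: for a d_out x d_in layer the adapter holds rank * (d_in + d_out) parameters instead of d_in * d_out.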
2025, 47(4): 137-142. Received: 2024-03-11
DOI:10.3404/j.issn.1672-7649.2025.04.022
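Classification number: U665.26
Funding: Doctoral Start-up Fund of Yunnan Normal University (00900205020503127); Graduate Research Innovation Fund of Yunnan Normal University (YJSJJ23-B112)
About the author: GAO Xin (b. 1999), male, master's student; research interest: deep-learning-based automatic detection for ship sonar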