To address the blurred details, multi-scale targets, and heavy computational demands of underwater image recognition, an improved EfficientNet-based recognition model is proposed. Initial model parameters are obtained by transfer learning on a public dataset, and an Adaptively Parametric ReLU (APReLU) activation function and a Selective Kernel Network (SK) attention module are introduced to strengthen the handling of fine image details and multi-scale features. Retaining the first layer of every MBConv6 block and embedding a BN layer and an APReLU module after the last MBConv6 block accelerates convergence and removes redundant features. Data augmentation, 10-fold cross-validation, and snapshot ensembling are used to further improve performance. Comparative experiments show that the model reaches 97.32% accuracy on the test set, 3.75 percentage points higher than the unimproved model, demonstrating strong recognition performance.
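To make the activation mechanism above concrete, the following is a minimal PyTorch sketch of an APReLU block, assuming the common formulation in which per-channel negative slopes are predicted from global average statistics of the positive and negative parts of the input; the hidden-layer sizes, the 320-channel usage example, and the class name are illustrative assumptions rather than the authors' exact implementation.

import torch
import torch.nn as nn

class APReLU(nn.Module):
    # Adaptively Parametric ReLU: y = max(x, 0) + alpha * min(x, 0),
    # where alpha is predicted per channel by a small fully connected subnetwork.
    def __init__(self, channels: int):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(2 * channels, channels),
            nn.BatchNorm1d(channels),
            nn.ReLU(inplace=True),
            nn.Linear(channels, channels),
            nn.BatchNorm1d(channels),
            nn.Sigmoid(),                      # keeps slope coefficients in (0, 1)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        pos = torch.relu(x)                    # positive part, max(x, 0)
        neg = x - pos                          # negative part, min(x, 0)
        # Global average pooling of both parts, concatenated to shape (N, 2C)
        stats = torch.cat([pos.mean(dim=(2, 3)), neg.mean(dim=(2, 3))], dim=1)
        alpha = self.fc(stats).unsqueeze(-1).unsqueeze(-1)   # (N, C, 1, 1)
        return pos + alpha * neg

# Illustrative use, mirroring the "BN + APReLU after the last MBConv6 block" idea;
# the 320-channel width is an assumption for the example only.
tail = nn.Sequential(nn.BatchNorm2d(320), APReLU(320))
y = tail(torch.randn(4, 320, 7, 7))            # output shape (4, 320, 7, 7)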
2024, 46(15): 95-100    Received: 2023-07-11
DOI:10.3404/j.issn.1672-7649.2024.15.017
CLC number: TP391.41
Foundation item: National Natural Science Foundation of China (61901079)
About the author: DING Yuanming (1967 – ), male, Ph.D., professor; research interests: underwater image and signal processing