To explore how a convolutional neural network can deliver good image recognition performance when sufficient image samples cannot be obtained, the relationship between training set size and the image recognition performance of convolutional neural networks is studied in depth. First, the conditions under which machine learning is feasible are stated, and a relationship between data set size and the number of network parameters is derived from VC dimension theory. Two network models, DigitNet and Cifar10Net, are then built and trained on handwritten digit recognition data sets and Cifar10 data sets of various sizes, and the recognition accuracy of each trained model is measured. Finally, the experimental results are analyzed against the derived relationship between training set size and parameter count. The results show that image recognition performance is tied to data set size: once the network's minimum data set size requirement is met, the convolutional neural network achieves good recognition performance. Therefore, when massive data sets are unavailable, a training set whose size is at least 10 times the number of model parameters is sufficient, as a lower bound, to obtain a well-performing network model.
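The derivation the abstract summarizes is not spelled out here, but the standard VC-dimension argument it alludes to runs roughly as follows (a sketch based on the usual Vapnik-Chervonenkis generalization bound; the authors' exact formulation may differ). With probability at least $1-\delta$ over a training set of $N$ samples,

\[
E_{\text{out}}(h) \;\le\; E_{\text{in}}(h) + \sqrt{\frac{8}{N}\,\ln\frac{4\,m_{\mathcal{H}}(2N)}{\delta}},
\qquad m_{\mathcal{H}}(N) \le N^{d_{\mathrm{VC}}} + 1,
\]

so the generalization gap shrinks as $N$ grows relative to the VC dimension $d_{\mathrm{VC}}$. Since the VC dimension of a neural network scales roughly with its number of trainable weights, demanding a small complexity term yields the familiar rule of thumb $N \gtrsim 10\,d_{\mathrm{VC}} \approx 10 \times (\text{parameter count})$, which matches the lower bound the abstract reports.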
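To make the conclusion concrete, the following Python/PyTorch sketch counts a small digit classifier's trainable parameters and derives the implied 10x lower bound on training set size. The layer layout is hypothetical, since the abstract does not specify DigitNet's architecture.

import torch
import torch.nn as nn

class DigitNet(nn.Module):
    """Hypothetical small CNN for 28x28 grayscale digit images (layout assumed, not from the paper)."""
    def __init__(self, num_classes=10):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=3), nn.ReLU(), nn.MaxPool2d(2),   # 28 -> 26 -> 13
            nn.Conv2d(8, 16, kernel_size=3), nn.ReLU(), nn.MaxPool2d(2),  # 13 -> 11 -> 5
            nn.Flatten(),
            nn.Linear(16 * 5 * 5, num_classes),
        )

    def forward(self, x):
        return self.net(x)

model = DigitNet()
n_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"trainable parameters: {n_params}")                     # 5258 for this layout
print(f"10x rule-of-thumb data lower bound: {10 * n_params}")  # 52580 samples

# The varying-capacity experiments can be emulated by truncating a data set, e.g.:
# from torch.utils.data import Subset
# subset = Subset(full_training_set, range(capacity))  # first `capacity` samples

Under this assumed layout the bound (about 53 000 samples) sits just below the 60 000 training images of a standard handwritten digit set such as MNIST, consistent with the abstract's claim that a modestly sized network can be trained well without a massive data set.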
2019, 41(11): 188-193. Received: 2018-11-29
DOI:10.3404/j.issn.1672-7649.2019.11.040
CLC number: TP399.1
Funding: military internal scientific research project of the Naval Equipment Department
About the author: XING Shihong (b. 1985), male, Ph.D. candidate; main research interests: computer vision and deep learning