The sharp growth in ship communication network data volume places higher demands on data processing performance. To improve processing efficiency, this paper studies a Hadoop-based parallel processing method for ship communication network data. The proposed Hadoop-based architecture consists of a data application layer, a data processing layer, and a data storage layer. The data application layer serves as the interface between users and the processing architecture and uploads the collected data into the architecture. The data processing layer runs MapReduce programs to parallelize data storage, parsing, and clustering. The data storage layer stores ship communication network data using several storage methods, including HBase and HDFS. Experimental results show that the method clusters ship communication network data accurately, greatly reduces data processing time, and achieves a good speedup ratio when the data volume is large.
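The clustering step in the data processing layer can be illustrated with a minimal sketch. The abstract does not name the clustering algorithm, so k-means is assumed here purely for illustration, and the code is plain Python that mimics the map and reduce phases rather than calling any Hadoop API. All function names (`map_phase`, `reduce_phase`, `kmeans_iteration`, `speedup`) are hypothetical. The speedup helper uses the usual definition: single-node runtime divided by parallel runtime.

```python
from collections import defaultdict
from math import dist


def map_phase(records, centroids):
    """Map: emit (index of nearest centroid, record) for every record."""
    for rec in records:
        idx = min(range(len(centroids)), key=lambda i: dist(rec, centroids[i]))
        yield idx, rec


def reduce_phase(pairs, centroids):
    """Reduce: average the records grouped under each key into a new centroid."""
    groups = defaultdict(list)
    for idx, rec in pairs:
        groups[idx].append(rec)
    new_centroids = list(centroids)  # a centroid with no records stays unchanged
    for idx, recs in groups.items():
        new_centroids[idx] = tuple(sum(col) / len(recs) for col in zip(*recs))
    return new_centroids


def kmeans_iteration(splits, centroids):
    """One k-means iteration over data splits.

    Assumption: on a real cluster each split would be mapped on a separate
    node; here the splits are processed sequentially for illustration.
    """
    pairs = []
    for split in splits:
        pairs.extend(map_phase(split, centroids))
    return reduce_phase(pairs, centroids)


def speedup(t_single, t_parallel):
    """Speedup ratio: single-node runtime divided by parallel runtime."""
    return t_single / t_parallel
```

In practice `kmeans_iteration` would be repeated until the centroids stop moving; the assignment work in the map phase is independent per record, which is why it parallelizes across nodes.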
2023, 45(7): 158-161. Received: 2022-10-23
DOI:10.3404/j.issn.1672-7649.2023.07.030
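Classification code: TP391
Funding: Changzhi University institutional teaching reform and innovation project (JC202012)
About the author: Zhao Jian (1986-), male, master's degree, lecturer; research interests include data mining, cloud computing, and software engineering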