To address the practical limitations of current methods for detecting abnormal equipment states, this paper proposes a detection method based on two-stage clustering. The method detects and locates anomalies in equipment state data without requiring extensive expert knowledge or large volumes of historical data. First, the acquired time series are segmented using the triangular extremum-point linear segmentation method. A first-stage clustering then identifies the initial abnormal time segments, and a second clustering identifies the measuring points and time periods associated with the anomaly, so that anomalies are located quickly and precisely in both the spatial and temporal dimensions. Experimental results show that the method reduces the false detection rate while greatly improving the accuracy of anomaly detection, giving it high practical value.
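The pipeline outlined in the abstract (segment, cluster time segments, then cluster measuring points) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the feature (absolute deviation from the cross-sensor median), the one-dimensional two-means clustering, the simplified extremum-based segmentation standing in for the triangular extremum-point method, and the names `extremum_breakpoints`, `two_means`, and `detect`.

```python
from statistics import mean, median

def extremum_breakpoints(series, min_len=5):
    """Cut a series at local extrema at least min_len samples apart
    (a simplified stand-in for triangular extremum-point segmentation)."""
    cuts = [0]
    for i in range(1, len(series) - 1):
        is_peak = series[i] > series[i - 1] and series[i] >= series[i + 1]
        is_valley = series[i] < series[i - 1] and series[i] <= series[i + 1]
        if (is_peak or is_valley) and i - cuts[-1] >= min_len:
            cuts.append(i)
    cuts.append(len(series))
    return list(zip(cuts[:-1], cuts[1:]))

def two_means(values, iters=20):
    """1-D k-means with k=2; label 1 marks the higher-valued cluster."""
    lo, hi = min(values), max(values)
    if lo == hi:  # nothing to separate
        return [0] * len(values)
    for _ in range(iters):
        labels = [int(abs(v - hi) < abs(v - lo)) for v in values]
        lo = mean(v for v, lab in zip(values, labels) if lab == 0)
        hi = mean(v for v, lab in zip(values, labels) if lab == 1)
    return labels

def detect(readings, min_len=5):
    """Return (measuring point, start, end) triples flagged as anomalous."""
    names = sorted(readings)
    n = len(readings[names[0]])
    # Assumed feature: deviation of each point from the cross-sensor median.
    med = [median([readings[k][t] for k in names]) for t in range(n)]
    dev = {k: [abs(readings[k][t] - med[t]) for t in range(n)] for k in names}
    agg = [max(dev[k][t] for k in names) for t in range(n)]

    # Stage 1: cluster time segments; the high-deviation cluster is suspect.
    segments = extremum_breakpoints(agg, min_len)
    seg_labels = two_means([mean(agg[a:b]) for a, b in segments])

    # Stage 2: within each suspect segment, cluster the measuring points.
    found = []
    for (a, b), flagged in zip(segments, seg_labels):
        if not flagged:
            continue
        point_labels = two_means([mean(dev[k][a:b]) for k in names])
        found += [(k, a, b) for k, lab in zip(names, point_labels) if lab]
    return found

# Synthetic example: four measuring points, "s3" is offset during t = 30..39.
base = [float(t % 4) for t in range(60)]
readings = {f"s{i}": list(base) for i in range(4)}
for t in range(30, 40):
    readings["s3"][t] += 5.0
print(detect(readings))
```

On this synthetic input the sketch reports measuring point `s3` over the interval starting at sample 30 and ending before sample 40. With real, noisy data a two-way split always separates something, so a practical version would likely need a threshold or a minority-size check before declaring a cluster anomalous.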
2021, 43(8): 163-168. Received: 2020-12-10
DOI:10.3404/j.issn.1672-7649.2021.08.032
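Classification number: TM76
About the author: WU Yingyou (b. 1977), male, master's degree, senior engineer. Research interests: overall ship optimization design and vibration and noise control.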