Journal: Concurrency and Computation: Practice and Experience

An improved agglomerative hierarchical clustering anomaly detection method for scientific data



Abstract

Anomaly detection aims to identify data that deviate from the behavior of the majority of the data or from expected patterns. Traditional hierarchical clustering algorithms have been used for anomaly detection, but they suffer from low effectiveness and instability. We therefore propose an improved agglomerative hierarchical clustering method for anomaly detection. The method dynamically adjusts the number of clusters according to a self-defined criterion, which removes the need to choose that number manually, and it selects the clustering distance mode according to the cophenetic correlation coefficient, which removes the need to test candidate distance modes manually in each iteration. The performance of the proposed method is verified on the tensile test, HTRU2, and credit card datasets. Compared with the traditional methods, our method has the best overall performance (the highest F-measure with fewer iterations), which demonstrates its effectiveness for anomaly detection. Compared with the traditional methods (single, complete, average, and centroid mode), our method also achieves the best performance on the tensile test and HTRU2 datasets, which indicates stronger generalization. Compared with other methods (Decision + Gradient Boosted Tree, Decision Trees + Decision Stump, etc.) on the credit card dataset, our method achieves similar accuracy and ranks among the best in sensitivity.
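The distance-mode selection step described in the abstract can be illustrated with a minimal sketch. It assumes SciPy's hierarchical clustering routines and picks, among the four linkage modes named above, the one with the highest cophenetic correlation coefficient. The abstract does not give the self-defined criterion for the number of clusters, so the fixed `n_clusters` and the small-cluster rule in `flag_small_clusters` are hypothetical stand-ins, not the authors' method. The function names and the synthetic data are also illustrative only.

```python
import numpy as np
from scipy.cluster.hierarchy import cophenet, fcluster, linkage
from scipy.spatial.distance import pdist


def select_linkage(X, methods=("single", "complete", "average", "centroid")):
    """Return the linkage mode with the highest cophenetic correlation."""
    dists = pdist(X)  # condensed pairwise Euclidean distances
    scores = {}
    for method in methods:
        Z = linkage(X, method=method)  # Euclidean metric by default
        coeff, _ = cophenet(Z, dists)  # correlation with original distances
        scores[method] = coeff
    best = max(scores, key=scores.get)
    return best, scores


def flag_small_clusters(X, method, n_clusters=2, max_fraction=0.1):
    """Hypothetical rule: points in very small clusters are anomalies.

    Stand-in for the paper's self-defined criterion, which the
    abstract does not specify.
    """
    Z = linkage(X, method=method)
    labels = fcluster(Z, t=n_clusters, criterion="maxclust")
    ids, counts = np.unique(labels, return_counts=True)
    small = ids[counts <= max_fraction * len(X)]
    return np.isin(labels, small)


# Synthetic demo data (not from the paper): 50 normal points, 3 far outliers.
rng = np.random.default_rng(0)
normal = rng.normal(size=(50, 2))
outliers = 20 + 0.1 * rng.normal(size=(3, 2))
X = np.vstack([normal, outliers])

best, scores = select_linkage(X)
mask = flag_small_clusters(X, best)
print(best, int(mask.sum()))
```

On this toy data the three distant points form their own cluster under any of the four modes, so they are the ones flagged. On real data the choice of `n_clusters` and `max_fraction` would matter and would need a principled criterion such as the one the paper proposes.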
