
An Improved Online Stream Data Clustering Algorithm



Abstract

Stream data mining has been an active research topic in recent years. To improve the efficiency of stream data mining, this paper presents IStrAP, an online stream data clustering algorithm. IStrAP takes into account the characteristics of stream data, namely that it is potentially infinite, arrives rapidly, and cannot be rescanned, and adds an outlier-elimination method to the existing StrAP algorithm. IStrAP statistically analyzes the data in the reservoir (a temporary storage area) to obtain statistics and parameters that reflect the data's characteristics, removes abnormal data from the reservoir according to these statistical properties, and then clusters the data remaining in the reservoir. Experimental results show that IStrAP effectively eliminates outliers, achieves higher clustering accuracy and lower time complexity than the existing StrAP algorithm, and adapts better to dynamic changes in the stream data.
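The reservoir-filtering step described in the abstract can be illustrated with a minimal sketch. The abstract does not state which statistics or thresholds IStrAP uses, so everything specific here is an assumption made for illustration: the function name `filter_reservoir`, the distance-to-centroid statistic, and the cutoff of mean plus `k` standard deviations are hypothetical and are not the paper's method. The clustering of the kept points (affinity propagation in StrAP) is left out.

```python
import math
from statistics import mean, pstdev


def filter_reservoir(reservoir, k=2.0):
    """Split reservoir points into (kept, outliers).

    Assumed rule (not taken from the paper): a point is an outlier when its
    Euclidean distance to the reservoir centroid exceeds the mean distance
    plus k population standard deviations.
    """
    if len(reservoir) < 2:
        # Too few points to estimate any statistics; keep everything.
        return list(reservoir), []

    dim = len(reservoir[0])
    # Per-dimension mean of all points currently in the reservoir.
    centroid = [mean(p[i] for p in reservoir) for i in range(dim)]
    # Distance of every point to that centroid.
    dists = [math.dist(p, centroid) for p in reservoir]

    threshold = mean(dists) + k * pstdev(dists)
    kept = [p for p, d in zip(reservoir, dists) if d <= threshold]
    outliers = [p for p, d in zip(reservoir, dists) if d > threshold]
    return kept, outliers


# Nine points near the origin plus one far-away point.
reservoir = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (-0.1, 0.0),
    (0.0, -0.1), (0.05, 0.05), (-0.05, 0.05), (0.05, -0.05),
    (100.0, 100.0),
]
kept, outliers = filter_reservoir(reservoir)
```

In this sketch only `kept` would be handed to the clustering step. A single extreme point also shifts the centroid and inflates the standard deviation, so a more robust statistic (for example the median) may be preferable in practice.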
