
A clustering with slope algorithm based on item similarity


Abstract

With the development of the internet and the arrival of large volumes of data, the analysis of transactional data is becoming important in the field of data mining, and clustering algorithms for transactional datasets are becoming a hot topic. Among them, the clustering with slope algorithm (CLOPE) is widely used because of its superior performance, lower memory use, and better quality compared with other clustering algorithms. However, the quality of the CLOPE result depends on the order in which the data is input: different input orders of the same dataset yield different clusterings, and some orders can result in poor clustering. To solve this problem, this paper analyzes the CLOPE algorithm in depth and proves theoretically that placing records with more items earlier in the input greatly improves the quality of the result. A procedure to preprocess the dataset according to item similarity is proposed. The experimental results show that the algorithm produces clearly better-quality results when the proposed method is used, and that it is 10% faster than the traditional procedure. The resulting algorithm is effective and produces high-quality results for transaction datasets.
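The order dependence described in the abstract can be illustrated with a minimal sketch of the first (allocation) pass of CLOPE. It assumes the standard CLOPE profit criterion, in which each cluster contributes S(C) / W(C)^r * |C| with S the total item occurrences, W the number of distinct items, and r the repulsion parameter. All names here (`Cluster`, `clope_first_pass`, `profit`, `longest_first`) are hypothetical, and `longest_first` only sorts records by item count as a stand-in for "records with more items ahead"; the abstract does not specify the paper's item-similarity preprocessing, so it is not reproduced here.

```python
from collections import Counter


class Cluster:
    """Running statistics of one CLOPE cluster (hypothetical helper)."""

    def __init__(self):
        self.n = 0            # number of transactions in the cluster
        self.size = 0         # S(C): total item occurrences
        self.occ = Counter()  # item -> count; len(self.occ) is the width W(C)

    def delta_add(self, t, r):
        """Change in the profit numerator if transaction t (a non-empty set) joins."""
        new_size = self.size + len(t)
        new_width = len(self.occ) + sum(1 for item in t if item not in self.occ)
        gain = new_size * (self.n + 1) / new_width ** r
        if self.n:  # an empty cluster contributes nothing before the add
            gain -= self.size * self.n / len(self.occ) ** r
        return gain

    def add(self, t):
        self.n += 1
        self.size += len(t)
        self.occ.update(t)


def clope_first_pass(transactions, r=2.0):
    """Assign each transaction, in input order, to the cluster with the best gain.

    Only the initial allocation pass is sketched; the iterative refinement
    pass of CLOPE is omitted. Transactions are assumed to be non-empty.
    """
    clusters, labels = [], []
    for t in transactions:
        t = set(t)
        candidates = clusters + [Cluster()]  # last candidate is a fresh cluster
        best = max(range(len(candidates)),
                   key=lambda i: candidates[i].delta_add(t, r))
        if best == len(clusters):
            clusters.append(candidates[best])
        clusters[best].add(t)
        labels.append(best)
    return labels, clusters


def profit(clusters, r=2.0):
    """CLOPE profit: higher means tighter, more overlapping clusters."""
    total = sum(c.n for c in clusters)
    return sum(c.size / len(c.occ) ** r * c.n for c in clusters) / total


def longest_first(transactions):
    """Assumed stand-in ordering: records with more items come first."""
    return sorted(transactions, key=len, reverse=True)


data = [["a", "b"], ["a", "b", "c"], ["x", "y"], ["x", "y", "z"], ["a", "c"]]
labels, clusters = clope_first_pass(data)
ordered_labels, ordered_clusters = clope_first_pass(longest_first(data))
```

Comparing `profit(clusters)` with `profit(ordered_clusters)` on a real dataset is one way to observe how much the input order alone changes the outcome; the toy data above is too small to say anything about the paper's reported gains.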
