Item Set Mining Based on Cover Similarity

Abstract

While in standard frequent item set mining one tries to find item sets the support of which exceeds a user-specified threshold (minimum support) in a database of transactions, we strive to find item sets for which the similarity of their covers (that is, the sets of transactions containing them) exceeds a user-specified threshold. Starting from the generalized Jaccard index we extend our approach to a total of twelve specific similarity measures and a generalized form. We present an efficient mining algorithm that is inspired by the well-known Eclat algorithm and its improvements. By reporting experiments on several benchmark data sets we demonstrate that the runtime penalty incurred by the more complex (but also more informative) item set assessment is bearable and that the approach yields high quality and more useful item sets.
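
To make the idea of cover similarity concrete, the following is a minimal sketch, not the algorithm from the paper. It assumes that the generalized Jaccard index of an item set is the size of the intersection of the covers of its items divided by the size of their union, which is one common way to extend the Jaccard index to more than two sets. The function names (`item_covers`, `generalized_jaccard`, `mine_similar_item_sets`) and the toy transactions are hypothetical. The search is a plain depth-first enumeration over transaction-identifier sets in the spirit of Eclat, without any of the improvements the abstract refers to.

```python
def item_covers(transactions):
    """Map each item to its cover: the set of transaction ids containing it."""
    covers = {}
    for tid, items in enumerate(transactions):
        for item in items:
            covers.setdefault(item, set()).add(tid)
    return covers


def generalized_jaccard(item_set, covers):
    """Assumed definition: |intersection of item covers| / |union of item covers|."""
    sets = [covers[item] for item in item_set]
    union = set.union(*sets)
    if not union:
        return 0.0
    return len(set.intersection(*sets)) / len(union)


def mine_similar_item_sets(transactions, min_sim):
    """Depth-first search for item sets (size >= 2) with similarity >= min_sim.

    Adding an item can only shrink the intersection and grow the union,
    so the similarity never increases and failing branches can be pruned.
    """
    covers = item_covers(transactions)
    items = sorted(covers)
    found = {}

    def extend(prefix, inter, union, start):
        for k in range(start, len(items)):
            new_inter = inter & covers[items[k]]
            new_union = union | covers[items[k]]
            sim = len(new_inter) / len(new_union)
            if sim >= min_sim:
                new_set = prefix + (items[k],)
                found[new_set] = sim
                extend(new_set, new_inter, new_union, k + 1)

    for k, item in enumerate(items):
        extend((item,), covers[item], covers[item], k + 1)
    return found


if __name__ == "__main__":
    # Hypothetical toy database of five transactions.
    db = [{"a", "b", "c"}, {"a", "b"}, {"a", "b", "d"}, {"c", "d"}, {"a", "c"}]
    print(mine_similar_item_sets(db, min_sim=0.5))  # {('a', 'b'): 0.75}
```

In this toy database the items `a` and `b` share three of the four transactions that contain either of them, so the pair passes a threshold of 0.5, while every other pair falls below it. A support-based miner with a minimum support of two transactions would also report `('a', 'c')`, which illustrates how the two criteria can differ.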
