
Mining emerging substrings

Abstract

We introduce a new type of KDD pattern called emerging substrings. In a sequence database, an emerging substring (ES) of a data class is a substring that occurs more frequently in that class than in other classes. ESs are important to sequence classification because they capture significant contrasts between data classes and provide insights for the construction of sequence classifiers. We propose a suffix tree-based framework for mining ESs and study the effectiveness of applying one or more pruning techniques at different stages of our ES mining algorithm. Experimental results show that if the target class is small relative to the whole database, which is the normal scenario in single-class ES mining, most of the pruning techniques achieve considerable performance gains.
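
To make the definition of an ES concrete, the following is a minimal brute-force sketch in Python. It is not the suffix tree-based framework proposed in the paper and applies none of its pruning techniques. The abstract does not state how "more frequently" is quantified, so the support measure (fraction of sequences containing the substring), the growth-rate ratio, and the `min_support` and `min_growth` thresholds are all assumptions made for illustration, as are the function names.

```python
from collections import Counter


def substring_support(sequences):
    """Map each distinct substring to the fraction of sequences containing it.

    Assumption: support is counted once per sequence, not per occurrence.
    """
    counts = Counter()
    for seq in sequences:
        # Enumerate every distinct non-empty substring of this sequence.
        seen = {seq[i:j]
                for i in range(len(seq))
                for j in range(i + 1, len(seq) + 1)}
        counts.update(seen)
    n = len(sequences)
    return {s: c / n for s, c in counts.items()}


def emerging_substrings(target, others, min_support=0.5, min_growth=2.0):
    """Return substrings that are frequent in `target` and much rarer in `others`.

    Hypothetical criteria (not taken from the abstract):
      - support in the target class is at least `min_support`
      - growth rate = target support / other support is at least `min_growth`
        (treated as infinite when the substring never occurs in `others`)
    """
    sup_target = substring_support(target)
    sup_others = substring_support(others)
    result = {}
    for s, st in sup_target.items():
        if st < min_support:
            continue
        so = sup_others.get(s, 0.0)
        growth = float("inf") if so == 0 else st / so
        if growth >= min_growth:
            result[s] = (st, so, growth)
    return result


# Toy data: one small target class against the rest of the database.
target_class = ["abcd", "abce", "xabc"]
other_classes = ["bcd", "cde", "abx", "bce"]
es = emerging_substrings(target_class, other_classes)
for s in sorted(es, key=len, reverse=True):
    print(s, es[s])
```

This enumeration is quadratic in the length of each sequence, which is the kind of cost an index such as a suffix tree is meant to avoid; the sketch only illustrates what is being searched for, not how to search for it efficiently.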
