Venue: IEEE International Conference on Data Engineering

Quick-motif: An efficient and scalable framework for exact motif discovery

Abstract

Motif discovery in sequence databases has received considerable attention from both the database and data mining communities; here, a motif is the most correlated pair of subsequences in a sequence object. Motif discovery is expensive for emerging applications that involve very long sequences (e.g., a million observations per sequence) or rapidly arriving queries (e.g., one every 10 seconds). Prior work cannot offer fast correlation computation and pruning of subsequence pairs at the same time, because the two techniques require different orders for examining subsequence pairs. In this work, we propose a novel framework named Quick-Motif, which adopts a two-level approach that enables batch pruning at the outer level and fast correlation calculation at the inner level. We further propose two optimization techniques for the outer and inner levels. In our experimental study, our method is up to 3 orders of magnitude faster than state-of-the-art methods.
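
The motif definition used in the abstract (the most correlated pair of subsequences) can be illustrated with a naive quadratic baseline. The Python sketch below is not the Quick-Motif algorithm, which the abstract describes only at a high level. The function names, the choice of Pearson correlation over z-normalized windows, and the rule that the two subsequences must not overlap are all assumptions made for illustration.

```python
import math


def znorm(window):
    """Z-normalize a window; return None for a constant window (zero variance)."""
    n = len(window)
    mean = sum(window) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in window) / n)
    if std == 0.0:
        return None
    return [(x - mean) / std for x in window]


def pearson(a, b):
    """Pearson correlation of two z-normalized windows of equal length."""
    return sum(x * y for x, y in zip(a, b)) / len(a)


def brute_force_motif(series, length):
    """Return (correlation, i, j) for the most correlated pair of
    non-overlapping subsequences of the given length, or None if no
    valid pair exists. Assumption: overlapping pairs are excluded."""
    windows = [znorm(series[k:k + length])
               for k in range(len(series) - length + 1)]
    best = None
    for i, a in enumerate(windows):
        if a is None:
            continue
        # Start j at i + length so the two subsequences never overlap.
        for j in range(i + length, len(windows)):
            b = windows[j]
            if b is None:
                continue
            corr = pearson(a, b)
            if best is None or corr > best[0]:
                best = (corr, i, j)
    return best


# Hypothetical toy series: the pattern [0, 1, 3, 1] occurs at offsets 0 and 8.
series = [0, 1, 3, 1, 0, 5, 2, 7, 0, 1, 3, 1, 6, 4]
result = brute_force_motif(series, 4)
print(result)
```

On this toy input the two identical windows at offsets 0 and 8 form the motif, with a correlation of approximately 1. The baseline examines every pair of windows and recomputes each correlation from scratch, which is the cost that pruning and fast correlation computation are intended to reduce.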