ACM KDD International Conference on Knowledge Discovery and Data Mining; KDD 2008

Permu-pattern: Discovery of Mutable Permutation Patterns with Proximity Constraint


Abstract

Pattern discovery in sequences is an important problem in many applications, especially in computational biology and text mining. However, because the data are noisy, the traditional sequential pattern model may fail to reflect the underlying characteristics of sequence data in these applications. There are two challenges. First, mutation noise exists in the data, so symbols may be misrepresented by other symbols. Second, the order of symbols in sequences may be permuted. To address these problems, we propose a new sequential pattern model called mutable permutation patterns. Since the Apriori property does not hold for our permutation pattern model, a novel Permu-pattern algorithm is devised to mine frequent mutable permutation patterns from sequence databases. A reachability property is identified to prune the candidate set. Finally, we apply the permutation pattern model to a real genome dataset to discover gene clusters, which shows the effectiveness of the model. A large amount of synthetic data is also used to demonstrate the efficiency of the Permu-pattern algorithm.
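To make the two kinds of noise concrete, the following is a minimal illustrative sketch, not the Permu-pattern algorithm itself. It assumes a simplified reading of the model: a pattern occurs in a sequence if some contiguous window of the pattern's length contains the pattern's symbols in any order, with at most `max_mismatches` symbols replaced. The function names, the window-based proximity rule, and the mismatch budget are all assumptions made for illustration; the paper's actual definitions may differ.

```python
from collections import Counter


def occurs_as_permutation(sequence, pattern, max_mismatches=0):
    """Check for an order-insensitive, mutation-tolerant occurrence.

    Assumption (hypothetical, for illustration only): proximity means a
    contiguous window whose length equals the pattern length.
    """
    k = len(pattern)
    need = Counter(pattern)
    for start in range(len(sequence) - k + 1):
        window = Counter(sequence[start:start + k])
        # Pattern symbols that the window does not cover are treated
        # as having been mutated into some other symbol.
        missing = sum((need - window).values())
        if missing <= max_mismatches:
            return True
    return False


def support(database, pattern, max_mismatches=0):
    """Count the sequences in which the pattern occurs at least once."""
    return sum(
        occurs_as_permutation(seq, pattern, max_mismatches)
        for seq in database
    )


database = ["abcd", "xcbay", "abxd", "dcba"]
exact_support = support(database, "abc")
tolerant_support = support(database, "abc", max_mismatches=1)
```

Under these assumptions, "xcbay" and "dcba" match "abc" only because order is ignored, and "abxd" matches only once a single mutated symbol is tolerated.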
