Journal: Genomics & Informatics

An Efficient Approach to Mining Maximal Contiguous Frequent Patterns from Large DNA Sequence Databases

Abstract

Mining interesting patterns from DNA sequences is one of the most challenging tasks in bioinformatics and computational biology. Maximal contiguous frequent patterns are preferable for expressing the function and structure of DNA sequences and hence can capture the common data characteristics among related sequences. Biologists are interested in finding frequent orderly arrangements of motifs that are responsible for similar expression of a group of genes. In order to reduce mining time and complexity, however, most existing sequence mining algorithms either focus on finding short DNA sequences or require explicit specification of sequence lengths in advance. The challenge is to find longer sequences without specifying sequence lengths in advance. In this paper, we propose an efficient approach to mining maximal contiguous frequent patterns from large DNA sequence datasets. The experimental results show that our proposed approach is memory-efficient and mines maximal contiguous frequent patterns within a reasonable time.
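
To make the term "maximal contiguous frequent pattern" concrete, the following is a minimal brute-force sketch, not the approach proposed in the paper. It assumes that a pattern's support is the number of sequences containing it as a substring, and that a frequent pattern is maximal when extending it by one base on either side never yields another frequent pattern. The function names and the `min_support` parameter are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def frequent_kmers(sequences, k, min_support):
    """Return length-k substrings that occur in at least min_support sequences."""
    support = defaultdict(set)  # pattern -> indices of sequences containing it
    for idx, seq in enumerate(sequences):
        for i in range(len(seq) - k + 1):
            support[seq[i:i + k]].add(idx)
    return {p for p, ids in support.items() if len(ids) >= min_support}


def maximal_contiguous_frequent_patterns(sequences, min_support):
    """Naive level-wise search; no pattern length is specified in advance.

    Assumption (not from the paper): support counts sequences, not occurrences.
    """
    maximal = set()
    k = 1
    current = frequent_kmers(sequences, k, min_support)
    while current:
        longer = frequent_kmers(sequences, k + 1, min_support)
        # A length-k pattern is non-maximal if it is the prefix or suffix
        # of some frequent pattern of length k + 1.
        covered = {p[:-1] for p in longer} | {p[1:] for p in longer}
        maximal |= current - covered
        current = longer
        k += 1
    return maximal


if __name__ == "__main__":
    toy = ["ACGTAC", "TACGTT", "GACGTA"]  # made-up toy data
    print(sorted(maximal_contiguous_frequent_patterns(toy, min_support=2)))
```

The search stops by itself once no longer pattern is frequent, which is why no length has to be given up front. This sketch rescans every sequence at each length, so it is only suitable for small inputs; the memory and time efficiency reported in the abstract would require a more careful design.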
