Journal: Nucleic Acids Research

WORDUP: an efficient algorithm for discovering statistically significant patterns in DNA sequences


Abstract

We present a fast and sensitive method for isolating short nucleotide sequences that have non-random statistical properties and may therefore be biologically active. The method is based on a first-order Markov analysis and detects sequence motifs six to ten nucleotides long that are significantly shared (or avoided) among the sequences under investigation. It has been tested on a set of 521 sequences extracted from the Eukaryotic Promoter Database (2). The results demonstrate the accuracy and efficiency of the method: sequence motifs known to act as eukaryotic promoter elements, such as the TATA-box and the CAAT-box, were clearly identified. In addition, we found other statistically significant motifs whose biological roles are yet to be clarified.
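To make the idea of a first-order Markov analysis concrete, here is a minimal Python sketch. It is not the WORDUP implementation: the abstract does not give the exact statistic, so the function name `markov1_scores`, the pooling of counts over all input sequences, and the simple `(observed - expected) / sqrt(expected)` score are all assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def markov1_scores(seqs, k=6):
    """Compare observed k-mer counts with first-order Markov expectations.

    Hypothetical helper, not the published algorithm. Returns a dict
    mapping each observed k-mer to (observed, expected, score).
    """
    mono = Counter()    # single-nucleotide counts
    di = Counter()      # dinucleotide counts
    first = Counter()   # nucleotides that are followed by another one
    obs = Counter()     # observed k-mer counts
    positions = 0       # number of k-mer start positions

    for s in seqs:
        s = s.upper()
        mono.update(s)
        for i in range(len(s) - 1):
            di[s[i:i + 2]] += 1
            first[s[i]] += 1
        for i in range(len(s) - k + 1):
            obs[s[i:i + k]] += 1
            positions += 1

    total = sum(mono.values())
    scores = {}
    for word, o in obs.items():
        # P(word) = P(w1) * product of P(w[i+1] | w[i])
        p = mono[word[0]] / total
        for a, b in zip(word, word[1:]):
            p *= di[a + b] / first[a]
        e = positions * p
        # Assumed score: normalised deviation from the expectation.
        scores[word] = (o, e, (o - e) / sqrt(e))
    return scores
```

Under these assumptions, sorting the result by score puts over-represented ("shared") words at the top and under-represented ("avoided") words at the bottom. Running it for `k` from 6 to 10 would cover the motif lengths mentioned in the abstract.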
