Conference: Combinatorial Pattern Matching

Optimal Spaced Seeds for Hidden Markov Models, with Application to Homologous Coding Regions


Abstract

We study the problem of computing optimal spaced seeds for detecting sequences generated by a hidden Markov model. Inspired by recent work in DNA sequence alignment, we have developed such a model for representing the conservation between related DNA coding sequences. Our model includes positional dependencies and periodic rates of conservation, as well as regional deviations in overall conservation rate. We show that, for hidden Markov models in general, the probability that a seed is matched in a region can be computed efficiently, and we use these methods to compute the optimal seed for our models. Our experiments on real data show that the optimal seeds are substantially more sensitive than the seeds used in the standard alignment program BLAST, and also substantially better than those of PatternHunter or WABA, both of which use spaced seeds. Our results offer the hope of improved gene finding due to fewer missed exons in DNA/DNA comparison, and more effective homology search in general, and may have applications outside of bioinformatics.
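To make the quantity in the abstract concrete, the following is a minimal sketch of how the probability that a spaced seed matches somewhere in a region could be computed when matches and mismatches are emitted by a hidden Markov model. It is an illustrative dynamic program, not the algorithm from the paper: the function name `seed_hit_probability` and the parameters `init`, `trans`, and `match_prob` are hypothetical, the seed is written as a string of `1` (position must match) and `*` (don't care), and the state space grows exponentially with the seed span, so it is only practical for short seeds.

```python
def seed_hit_probability(seed, length, init, trans, match_prob):
    """Probability that `seed` matches at least once in `length` positions.

    seed:          string of '1' (must match) and '*' (don't care)
    init[s]:       probability that the first hidden state is s
    trans[s][t]:   probability of moving from hidden state s to t
    match_prob[s]: probability that state s emits a match (bit 1)
    """
    span = len(seed)
    if length < span:
        return 0.0
    n_states = len(init)
    required = [i for i, c in enumerate(seed) if c == "1"]

    # no_hit[(state, window)]: probability of being in `state`, having emitted
    # `window` as the most recent (at most span - 1) bits, with no hit so far.
    no_hit = {(s, ()): init[s] for s in range(n_states)}

    for _ in range(length):
        nxt = {}
        for (s, window), p in no_hit.items():
            for bit in (1, 0):
                p_emit = match_prob[s] if bit else 1.0 - match_prob[s]
                full = window + (bit,)
                # A full-span window with a 1 at every required position is a
                # hit; that probability mass leaves the "no hit" table.
                if len(full) == span and all(full[i] for i in required):
                    continue
                new_window = full[-(span - 1):] if span > 1 else ()
                for t in range(n_states):
                    q = p * p_emit * trans[s][t]
                    if q:
                        key = (t, new_window)
                        nxt[key] = nxt.get(key, 0.0) + q
        no_hit = nxt

    return 1.0 - sum(no_hit.values())
```

Under these assumptions, candidate seeds of the same weight (for example the contiguous `111` and the spaced `11*1`) can be compared by calling the function with the same model parameters and region length and keeping the seed with the highest probability.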
