IEEE International Workshop on Genomic Signal Processing and Statistics

RECONSTRUCTING LATENT PERIODS IN GENOME SEQUENCES WITH INSERTIONS AND DELETIONS


Abstract

Tandem and latent repeats in genome sequences provide insight into their various structural and functional roles. Such regions are modeled as cyclostationary processes, generated by a collection of information sources acting in a cyclic manner. Maximum likelihood (ML) estimates can be readily computed for the cyclostationary profiles and for the statistical period of such subsequences. In the presence of insertions and deletions, however, the ML estimators lose much of their ability to identify the periods accurately. This paper extends the cyclic model to a profile hidden Markov model (PHMM) to account for insertions and deletions. An iterative algorithm is developed to learn the parameters of the PHMM, and the Viterbi algorithm is employed to find the most likely path through the state space. This reconstructs likely insertions and deletions in the sequence and yields better estimates of the statistical period and cyclostationary profiles than the ML approach. Experimental results are provided for simulated sequences as well as for the chromosome 1 sequence of the human genome.
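
To make the cyclic model concrete, the following is a minimal sketch of ML period estimation without insertions or deletions. It assumes that the symbol at position i is drawn from source i mod p, each source being a categorical distribution over the four nucleotides. The function names, the `max_period` parameter, and the BIC-style penalty used to keep multiples of the period from winning are assumptions made for illustration; the abstract does not specify the authors' estimator.

```python
import math
from collections import Counter

ALPHABET = "ACGT"

def profile_log_likelihood(seq, period):
    """Maximized log-likelihood of seq under a cyclic model with the given period.

    Position i is assigned to phase i % period; the ML profile of each phase
    is simply the empirical symbol frequency within that phase.
    """
    counts = [Counter() for _ in range(period)]
    for i, sym in enumerate(seq):
        counts[i % period][sym] += 1
    ll = 0.0
    for phase_counts in counts:
        total = sum(phase_counts.values())
        for n in phase_counts.values():
            ll += n * math.log(n / total)
    return ll

def estimate_period(seq, max_period):
    """Return the candidate period with the best penalized log-likelihood.

    The penalty is a BIC-style term (an assumption of this sketch): without
    it, the likelihood never decreases when the period is doubled.
    """
    best_period, best_score = None, -math.inf
    for p in range(1, max_period + 1):
        n_params = p * (len(ALPHABET) - 1)
        score = profile_log_likelihood(seq, p) - 0.5 * n_params * math.log(len(seq))
        if score > best_score:
            best_period, best_score = p, score
    return best_period

print(estimate_period("ACG" * 40, max_period=10))  # prints 3
```

A single inserted or deleted symbol shifts the phase of everything after it, so the per-phase counts blur together; this is the failure mode that the PHMM extension described in the abstract is meant to address.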
