Proceedings of the 2007 International Conference on Machine Learning and Cybernetics

A FAST METHOD FOR DETERMINING THE REPEAT PATTERN SIZE IN DNA SEQUENCES



Abstract

Tandem repeats occur frequently in the human genome. Their functions are still largely unclear, but some have been shown to cause human disease and to be related to regulatory functions. Thus, detecting tandem repeats is of considerable significance. Because the length of the repeat pattern is not known in advance, and because a tandem repeat may contain indels and substitutions, identifying a tandem repeat in genomic sequence data is a difficult task. In this paper, an efficient algorithm based on the autoregressive (AR) model is proposed. We analyze the residual errors of AR models of different orders fitted to a DNA sequence. From the changes in the residual errors, we can determine whether a sequence contains a tandem repeat and what its pattern size is. Examples show that this algorithm can detect not only exact tandem repeats but also approximate ones.
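The residual-error idea can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not say how nucleotides are mapped to numbers, how the AR model is fitted, or how a change in residual is judged. The sketch assumes four 0/1 indicator sequences (one per base), an ordinary least-squares AR fit without intercept, and a hypothetical `drop_ratio` threshold. All function names are illustrative.

```python
import numpy as np

BASES = "ACGT"

def indicator_channels(seq):
    """Encode a DNA string as four 0/1 indicator sequences (assumed encoding)."""
    s = seq.upper()
    return np.array([[1.0 if ch == b else 0.0 for ch in s] for b in BASES])

def ar_residual(x, order):
    """Mean squared residual of a least-squares AR(order) fit to x, no intercept."""
    n = len(x)
    # Column k holds x[t-k] for every target index t = order, ..., n-1.
    lags = np.column_stack([x[order - k : n - k] for k in range(1, order + 1)])
    target = x[order:]
    coeffs, *_ = np.linalg.lstsq(lags, target, rcond=None)
    err = target - lags @ coeffs
    return float(np.mean(err ** 2))

def residual_profile(seq, max_order):
    """Residual summed over the four channels, for orders 1..max_order."""
    channels = indicator_channels(seq)
    return [sum(ar_residual(ch, p) for ch in channels)
            for p in range(1, max_order + 1)]

def estimate_pattern_size(seq, max_order, drop_ratio=0.1):
    """First order whose residual falls to drop_ratio of the largest residual.

    Returns None when no order shows such a drop (hypothetical decision rule).
    """
    profile = residual_profile(seq, max_order)
    threshold = drop_ratio * max(profile)
    for order, residual in enumerate(profile, start=1):
        if residual <= threshold:
            return order
    return None

if __name__ == "__main__":
    sequence = "ACGTTG" * 15  # exact tandem repeat with pattern size 6
    print(residual_profile(sequence, 8))
    print(estimate_pattern_size(sequence, 8))
```

For an exact repeat of period P, the model that copies the value P positions back has zero error, so the residual collapses at order P. With this encoding the collapse can occur at a smaller order when the pattern also satisfies a shorter linear recurrence (for example `AACC`), so the estimate should be read as a lower-bound hint and not as a guaranteed period.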
