Source: U.S. National Institutes of Health literature, Biophysical Journal

Correlation approach to identify coding regions in DNA sequences.



Abstract

Recently, it was observed that noncoding regions of DNA sequences possess long-range power-law correlations, whereas coding regions typically display only short-range correlations. We develop an algorithm based on this finding that enables investigators to perform a statistical analysis on long DNA sequences to locate possible coding regions. The algorithm is particularly successful in predicting the location of lengthy coding regions. For example, for the complete genome of yeast chromosome III (315,344 nucleotides), at least 82% of the predictions correspond to putative coding regions; the algorithm correctly identified all coding regions larger than 3000 nucleotides, 92% of coding regions between 2000 and 3000 nucleotides long, and 79% of coding regions between 1000 and 2000 nucleotides. The predictive ability of this new algorithm supports the claim that there is a fundamental difference in the correlation property between coding and noncoding sequences. This algorithm, which is not species-dependent, can be implemented with other techniques for rapidly and accurately locating relatively long coding regions in genomic sequences.
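The abstract does not spell out the procedure, so the following is only an illustrative sketch of how a correlation-based scan could look, not the authors' algorithm. It assumes a purine/pyrimidine "DNA walk" (A, G mapped to +1; C, T mapped to -1) and estimates a scaling exponent from the root-mean-square fluctuation of the walk over several distances. Under these assumptions an exponent near 0.5 indicates no long-range correlation, and clearly larger values indicate long-range correlation. All function names, the mapping rule, the scales, and the window sizes are hypothetical choices made for this example.

```python
import math
import random


def dna_walk(seq):
    """Cumulative walk: +1 for purines (A, G), -1 for any other symbol.

    The purine/pyrimidine rule is an assumption made for this sketch.
    """
    walk, y = [0], 0
    for base in seq.upper():
        y += 1 if base in "AG" else -1
        walk.append(y)
    return walk


def fluctuation(walk, ell):
    """Standard deviation of the walk displacement over a distance ell."""
    deltas = [walk[i + ell] - walk[i] for i in range(len(walk) - ell)]
    mean = sum(deltas) / len(deltas)
    var = sum((d - mean) ** 2 for d in deltas) / len(deltas)
    return math.sqrt(var)


def scaling_exponent(seq, scales=(4, 8, 16, 32, 64)):
    """Least-squares slope of log F(ell) against log ell.

    Returns None when some fluctuation is zero (for example, a strictly
    periodic sequence), because the logarithm is undefined there.
    The sequence must be longer than the largest scale.
    """
    walk = dna_walk(seq)
    fs = [fluctuation(walk, ell) for ell in scales]
    if any(f == 0 for f in fs):
        return None
    xs = [math.log(ell) for ell in scales]
    ys = [math.log(f) for f in fs]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


def window_exponents(seq, window=1000, step=250):
    """Scaling exponent in each sliding window, as (start, exponent) pairs."""
    return [
        (start, scaling_exponent(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


if __name__ == "__main__":
    # Synthetic, uncorrelated sequence used only to exercise the code.
    rng = random.Random(0)
    demo = "".join(rng.choice("ACGT") for _ in range(5000))
    for start, alpha in window_exponents(demo):
        print(start, alpha)
```

In a scan of this kind, windows whose exponent stays close to 0.5 would be the candidates for coding regions, in line with the abstract's claim that coding regions show only short-range correlations. Any threshold would have to be calibrated on annotated sequences.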
