Repeats and correlations in human DNA sequences - art. no. 061913

Abstract

We study the nucleotide-nucleotide mutual information function I(k) of the DNA sequences of the three completely sequenced human chromosomes 20, 21, and 22. We find in each human chromosome (i) the absence of the k=3 base pair (bp) sequence periodicity characteristic for protein coding regions, (ii) the absence of the k=10-11 bp sequence periodicity characteristic for both protein secondary structure and DNA bendability, and (iii) the presence of significant statistical dependencies at about k=135 bp and at about k=165 bp. We investigate to which degree the density and composition of interspersed repeats might explain these observed statistical patterns in all three human chromosomes. We use simple stochastic models to substitute known interspersed repeats and find by numerical studies that (iv) the presence of interspersed repeats dominates short-range correlations as measured by I(k) on the scale of several hundred base pairs in human chromosomes 20, 21, and 22. On the other hand, we find that (v) interspersed repeats contribute only weakly to long-range correlations due to the clustering of highly abundant Alu repeats. [References: 69]
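
For readers unfamiliar with the quantity, the following is a minimal sketch of how a mutual information function I(k) could be estimated from a nucleotide string. It assumes the standard definition, I(k) = sum over symbol pairs (a, b) of p_ab(k) * log2[p_ab(k) / (p_a * p_b)], with probabilities replaced by observed frequencies. The abstract does not state the estimator the authors used (for instance, whether finite-sample corrections were applied), so the function name `mutual_information` and the plain plug-in estimate here are illustrative assumptions, not the paper's method.

```python
import math
from collections import Counter


def mutual_information(seq, k):
    """Plug-in estimate of I(k), in bits, between symbols k positions apart.

    Hypothetical helper for illustration; no finite-sample bias correction.
    """
    if k < 1:
        raise ValueError("k must be a positive integer")
    # All pairs (symbol at i, symbol at i + k) that fit inside the sequence.
    pairs = [(seq[i], seq[i + k]) for i in range(len(seq) - k)]
    n = len(pairs)
    if n == 0:
        raise ValueError("sequence is too short for this k")

    joint = Counter(pairs)                 # counts of (a, b) pairs
    left = Counter(a for a, _ in pairs)    # marginal counts of first symbol
    right = Counter(b for _, b in pairs)   # marginal counts of second symbol

    info = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        # p_ab / (p_a * p_b) simplifies to count * n / (left[a] * right[b]).
        info += p_ab * math.log2(count * n / (left[a] * right[b]))
    return info


if __name__ == "__main__":
    toy = "ACGT" * 50  # toy sequence with an exact period of 4
    for k in (1, 2, 3, 4):
        print(k, round(mutual_information(toy, k), 3))
```

Scanning k over a range and plotting the result is how a periodicity such as the k=3 signal of coding regions, or the peaks near k=135 and k=165 reported above, would show up as local maxima of I(k). On real chromosomes the plug-in estimate is biased upward for finite samples, so a correction or a shuffled-sequence baseline would normally be needed.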
