Journal: Bioinformatics

A statistical approach for inferring the 3D structure of the genome

Abstract

Motivation: Recent technological advances allow the measurement, in a single Hi-C experiment, of the frequencies of physical contacts among pairs of genomic loci at a genome-wide scale. The next challenge is to infer, from the resulting DNA-DNA contact maps, accurate 3D models of how chromosomes fold and fit into the nucleus. Many existing inference methods rely on multidimensional scaling (MDS), in which the pairwise distances of the inferred model are optimized to resemble pairwise distances derived directly from the contact counts. These approaches, however, often optimize a heuristic objective function and require strong assumptions about the biophysics of DNA to transform interaction frequencies into spatial distances, and may thereby lead to incorrect structure reconstruction.

Methods: We propose a novel approach to infer a consensus 3D structure of a genome from Hi-C data. The method incorporates a statistical model of the contact counts, assuming that the counts between two loci follow a Poisson distribution whose intensity decreases with the physical distance between the loci. The method can automatically adjust the transfer function relating the spatial distance to the Poisson intensity and infer a genome structure that best explains the observed data.

Results: We compare two variants of our Poisson method, with or without optimization of the transfer function, to four different MDS-based algorithms (two metric MDS methods using different stress functions, a non-metric version of MDS, and ChromSDE, a recently described, advanced MDS method) on a wide range of simulated datasets. We demonstrate that the Poisson models reconstruct better structures than all MDS-based methods, particularly at low coverage and high resolution, and we highlight the importance of optimizing the transfer function. On publicly available Hi-C data from mouse embryonic stem cells, we show that the Poisson methods lead to more reproducible structures than MDS-based methods when we use data generated using different restriction enzymes, and when we reconstruct structures at different resolutions.
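The statistical model described under Methods can be illustrated with a minimal sketch. The abstract does not state the form of the transfer function, so the power law used here (intensity = `beta * distance ** alpha`, with `alpha < 0`) is an assumption made for illustration, as are the function name, parameter names, and default values. The sketch only evaluates the Poisson negative log-likelihood of a contact-count matrix for a given set of 3D coordinates; it is not the authors' implementation and does not perform the optimization itself.

```python
import math
from itertools import combinations


def poisson_neg_log_likelihood(coords, counts, alpha=-3.0, beta=1.0):
    """Negative Poisson log-likelihood of contact counts given 3D coordinates.

    coords: list of (x, y, z) tuples, one per locus (assumed pairwise distinct).
    counts: symmetric matrix (list of lists) of observed contact counts.
    alpha, beta: parameters of the ASSUMED power-law transfer function
                 intensity = beta * distance ** alpha (alpha < 0).
    """
    nll = 0.0
    for i, j in combinations(range(len(coords)), 2):
        d = math.dist(coords[i], coords[j])   # Euclidean distance between loci
        lam = beta * d ** alpha               # Poisson intensity decreases with d
        c = counts[i][j]
        # -log P(c | lam) = lam - c * log(lam) + log(c!)
        nll += lam - c * math.log(lam) + math.lgamma(c + 1)
    return nll


# Toy example: three loci on a line, with counts equal to the intensities
# that the assumed model predicts for alpha = -3 and beta = 8.
true_coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
stretched_coords = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
toy_counts = [[0, 8, 1],
              [8, 0, 8],
              [1, 8, 0]]

nll_true = poisson_neg_log_likelihood(true_coords, toy_counts, alpha=-3.0, beta=8.0)
nll_stretched = poisson_neg_log_likelihood(stretched_coords, toy_counts, alpha=-3.0, beta=8.0)
```

In this toy case the structure whose predicted intensities match the counts has the lower negative log-likelihood. Inferring a structure would amount to minimizing this quantity over the coordinates, and, for the variant that optimizes the transfer function, over its parameters as well (here `alpha` and `beta`).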
