Conference: Annual International Conference on Research in Computational Molecular Biology

A Bayesian Framework for Estimating Cell Type Composition from DNA Methylation Without the Need for Methylation Reference


Abstract

Genome-wide DNA methylation levels measured from a target tissue across a population have become ubiquitous over the last few years, as methylation status is suggested to hold great potential for better understanding the role of epigenetics. Different cell types are known to have different methylation profiles. Therefore, in the common scenario where methylation levels are collected from heterogeneous sources such as blood, convoluted signals are formed according to the cell type composition of the samples. Knowledge of the cell type proportions is important for statistical analysis, and it may provide novel biological insights and contribute to our understanding of disease biology. Since high resolution cell counting is costly and often logistically impractical to obtain in large studies, targeted methods that are inexpensive and practical for estimating cell proportions are needed. Although a supervised approach has been shown to provide reasonable estimates of cell proportions, this approach leverages scarce reference methylation data from sorted cells, which are not available for most tissues and are not appropriate for every target population. Here, we introduce BayesCCE, a Bayesian semi-supervised method that leverages prior knowledge on the cell type composition distribution in the studied tissue. As we demonstrate, such prior information is substantially easier to obtain than appropriate reference methylation levels from sorted cells. Using real and simulated data, we show that our proposed method is able to construct a set of components, each corresponding to a single cell type, and together providing up to 50% improvement in correlation when compared with existing reference-free methods. We further make a design suggestion for future data collection efforts by showing that results can be further improved using cell count measurements for a small subset of individuals in the study sample or by incorporating external data of individuals with measured cell counts. Our approach provides a new opportunity to investigate cell compositions in genomic studies of tissues for which this was not previously possible.
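
The mixing idea described in the abstract, in which bulk methylation is a blend of cell-type-specific profiles weighted by each sample's cell type proportions, can be illustrated with a small simulation. This is only a sketch under stated assumptions and not the BayesCCE method itself: the Dirichlet form of the prior, the function name `simulate_bulk_methylation`, the parameter values, and the noise level are all hypothetical choices made for illustration.

```python
import numpy as np

def simulate_bulk_methylation(n_samples, n_sites, alpha, noise_sd=0.01, seed=0):
    """Simulate bulk methylation as a proportion-weighted mix of cell-type profiles.

    All modelling choices here are illustrative assumptions, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    k = len(alpha)
    # Per-sample cell type proportions, drawn from an assumed Dirichlet prior.
    # Each row is non-negative and sums to 1.
    R = rng.dirichlet(alpha, size=n_samples)            # shape (n_samples, k)
    # Hypothetical cell-type-specific methylation levels in [0, 1].
    M = rng.uniform(0.0, 1.0, size=(k, n_sites))        # shape (k, n_sites)
    # Observed bulk signal: linear mixture plus small Gaussian noise,
    # clipped back to the valid methylation range.
    noise = rng.normal(0.0, noise_sd, size=(n_samples, n_sites))
    O = np.clip(R @ M + noise, 0.0, 1.0)                # shape (n_samples, n_sites)
    return O, R, M

# Example: 50 samples, 200 sites, 3 cell types with unequal prior weights.
O, R, M = simulate_bulk_methylation(n_samples=50, n_sites=200, alpha=[15.0, 5.0, 2.0])
```

In this toy setting only `O` would be observed in practice. A reference-based method would additionally need `M` from sorted cells, whereas the approach described in the abstract instead relies on prior knowledge about the distribution of `R`.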
