PLoS Computational Biology

A Bayesian Framework to Identify Methylcytosines from High-Throughput Bisulfite Sequencing Data



Abstract

High-throughput bisulfite sequencing technologies provide a comprehensive and well-suited way to investigate DNA methylation at single-base resolution. However, precisely distinguishing methylcytosines from unconverted cytosines in bisulfite sequencing data poses substantial bioinformatic challenges. These challenges arise, at least in part, from the cell heterozygosis caused by multicellular sequencing and from the still limited number of statistical methods available for methylcytosine calling from bisulfite sequencing data. Here, we present Bycom, a new Bayesian model that performs methylcytosine calling with high accuracy. Bycom accounts for cell heterozygosis along with sequencing errors and bisulfite conversion efficiency to improve calling accuracy. We compared Bycom with Lister, the method most widely used to identify methylcytosines from bisulfite sequencing data. Bycom outperformed Lister on data with high methylation levels, and it also showed higher sensitivity and specificity than Lister on samples with low methylation levels (<1%). A validation experiment based on reduced representation bisulfite sequencing data suggested that Bycom had a false positive rate of about 4% while maintaining an accuracy of close to 94%. This study demonstrated that Bycom has a low false calling rate at any methylation level and calls methylcytosines accurately at high methylation levels. Bycom will contribute significantly to studies aimed at recalibrating the methylation level of genomic regions based on the presence of methylcytosines.
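
To make the idea of Bayesian methylcytosine calling concrete, the following is a minimal illustrative sketch and not Bycom's actual model, which the abstract does not specify. It assumes a two-hypothesis setup at a single cytosine site: the site is either unmethylated in every cell, or methylated in an unknown fraction of cells (a stand-in for cell heterozygosis, averaged over a uniform grid). The function names, the default conversion efficiency (0.99), the sequencing error rate (0.01), and the prior (0.5) are all hypothetical choices made for illustration.

```python
from math import comb


def read_c_prob(true_c_fraction, seq_error):
    """Probability that a read reports C, given the fraction of molecules
    that still carry C after bisulfite conversion (assumed error model:
    a wrong base is equally likely to be any of the other three)."""
    return true_c_fraction * (1.0 - seq_error) + (1.0 - true_c_fraction) * seq_error / 3.0


def binom_pmf(k, n, p):
    """Binomial probability of k successes in n trials."""
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)


def posterior_methylated(c_reads, depth, conversion=0.99, seq_error=0.01,
                         prior=0.5, grid=100):
    """Posterior probability that a site is methylated (hypothetical model).

    c_reads: number of reads showing C at the site
    depth:   total number of reads covering the site
    """
    # Unmethylated site: only unconverted cytosines still read as C.
    p_unmeth = read_c_prob(1.0 - conversion, seq_error)
    like_unmeth = binom_pmf(c_reads, depth, p_unmeth)

    # Methylated site: average the likelihood over the methylated cell
    # fraction m on a uniform grid in (0, 1].
    like_meth = 0.0
    for i in range(1, grid + 1):
        m = i / grid
        true_c = m + (1.0 - m) * (1.0 - conversion)
        like_meth += binom_pmf(c_reads, depth, read_c_prob(true_c, seq_error))
    like_meth /= grid

    numerator = prior * like_meth
    return numerator / (numerator + (1.0 - prior) * like_unmeth)
```

Under these assumptions, a site with 8 C reads out of 10 receives a posterior very close to 1, while a site with 0 C reads out of 20 receives a low posterior. A real caller would also need to handle strand, base quality, and genome-wide priors, none of which are modeled here.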
