Conference: Asia-Pacific Bioinformatics Conference

A Bayesian approach for estimating allele-specific expression from RNA-Seq data with diploid genomes

Abstract

Background: RNA-sequencing (RNA-Seq) has become a popular tool for transcriptome profiling in mammals. However, accurate estimation of allele-specific expression (ASE) based on alignments of reads to the reference genome is challenging, because the reference contains only one allele at each locus on a mosaic haploid genome. Even when diploid genome sequences are available, precise alignment of reads to the correct allele is difficult because of the high similarity between the corresponding allele sequences.

Results: We propose a Bayesian approach to estimate ASE from RNA-Seq data with diploid genome sequences. In the statistical framework, the haploid choice is modeled as a hidden variable and estimated simultaneously with isoform expression levels by variational Bayesian inference. Through the analysis of simulation data, we demonstrate the effectiveness of the proposed approach in identifying ASE compared to the existing approach. We also show that our approach enables better quantification of isoform expression levels than the existing methods TIGAR2, RSEM and Cufflinks. In the real data analysis of the human reference lymphoblastoid cell line GM12878, some autosomal genes were identified as ASE genes, and skewed paternal X-chromosome inactivation in GM12878 was identified.

Conclusions: The proposed method, called ASE-TIGAR, enables accurate estimation of gene expression from RNA-Seq data in an allele-specific manner. Our results show the effectiveness of utilizing personal genomic information for accurate estimation of ASE. An implementation of our method is available at http://nagasakilab.csml.org/ase-tigar.
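The idea of treating the haploid choice as a hidden variable and estimating it jointly with isoform abundance can be illustrated with a toy sketch. This is not the ASE-TIGAR implementation: the function names, the prior values, and the two-level factorization (a Dirichlet over isoforms, a Beta over the two haplotypes of each isoform) are assumptions made for illustration, and the per-read likelihoods are taken as given rather than computed from alignments.

```python
import math


def digamma(x):
    """Digamma function via recurrence plus an asymptotic series (x > 0)."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return result + math.log(x) - 0.5 / x - series


def vb_allele_expression(lik, n_iter=200, alpha0=1.0, beta0=1.0):
    """Toy variational Bayes for allele-specific isoform abundance.

    lik[n][t][h] is the (assumed, precomputed) likelihood of read n
    given isoform t and haplotype h in {0, 1}.
    Returns posterior-mean isoform abundances and haplotype fractions.
    """
    n_iso = len(lik[0])
    alpha = [alpha0] * n_iso                          # Dirichlet over isoforms
    beta = [[beta0, beta0] for _ in range(n_iso)]     # Beta per isoform
    for _ in range(n_iter):
        # Expected log parameters under the current variational posterior.
        psi_sum = digamma(sum(alpha))
        e_log_theta = [digamma(a) - psi_sum for a in alpha]
        e_log_phi = [
            [digamma(b[h]) - digamma(b[0] + b[1]) for h in (0, 1)]
            for b in beta
        ]
        new_alpha = [alpha0] * n_iso
        new_beta = [[beta0, beta0] for _ in range(n_iso)]
        for read in lik:
            # Unnormalized responsibility of each (isoform, haplotype) pair.
            w = [
                [read[t][h] * math.exp(e_log_theta[t] + e_log_phi[t][h])
                 for h in (0, 1)]
                for t in range(n_iso)
            ]
            z = sum(map(sum, w))
            if z == 0.0:
                continue  # read is incompatible with every candidate
            for t in range(n_iso):
                for h in (0, 1):
                    r = w[t][h] / z
                    new_alpha[t] += r
                    new_beta[t][h] += r
        alpha, beta = new_alpha, new_beta
    total = sum(alpha)
    theta = [a / total for a in alpha]
    phi = [[b[0] / (b[0] + b[1]), b[1] / (b[0] + b[1])] for b in beta]
    return theta, phi


# Hypothetical reads for two isoforms: isoform 0 has a heterozygous SNP,
# isoform 1 has none, so its reads cannot distinguish the haplotypes.
reads = (
    [[[1.0, 0.01], [0.0, 0.0]]] * 30    # match haplotype 0 of isoform 0
    + [[[0.01, 1.0], [0.0, 0.0]]] * 10  # match haplotype 1 of isoform 0
    + [[[1.0, 1.0], [0.0, 0.0]]] * 20   # isoform 0, SNP not covered
    + [[[0.0, 0.0], [1.0, 1.0]]] * 20   # isoform 1, no SNP
)
theta, phi = vb_allele_expression(reads)
print("isoform abundances:", theta)
print("haplotype fractions per isoform:", phi)
```

In this sketch, reads that do not cover a heterozygous site are shared between the two haplotypes according to the current posterior, which is the role the hidden haploid-choice variable plays in the description above.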
