Journal: Genetics

A Novel Statistical Approach for Jointly Analyzing RNA-Seq Data from F1 Reciprocal Crosses and Inbred Lines



Abstract

RNA sequencing (RNA-seq) not only measures total gene expression but may also measure allele-specific gene expression in diploid individuals. RNA-seq data collected from F1 reciprocal crosses in mice can powerfully dissect strain and parent-of-origin effects on allelic imbalance of gene expression. In this article, we develop a novel statistical approach to analyze RNA-seq data from F1 and inbred strains. Method development was motivated by a study of F1 reciprocal crosses derived from highly divergent mouse strains, to which we apply the proposed method. Our method jointly models the total number of reads and the number of allele-specific reads of each gene, which significantly boosts power for detecting strain and particularly parent-of-origin effects. The method deals with the overdispersion problem commonly observed in read counts and can flexibly adjust for the effects of covariates such as sex and read depth. The X chromosome in mouse presents particular challenges. As in other mammals, X chromosome inactivation silences one of the two X chromosomes in each female cell, although the choice of which chromosome is silenced can be highly skewed by alleles at the X-linked X-controlling element (Xce) and by stochastic effects. Our model accounts for these chromosome-wide effects on an individual level, allowing proper analysis of chromosome X expression. Furthermore, we propose a genomic control procedure to properly control type I error for RNA-seq studies. A number of these methodological improvements can also be applied to RNA-seq data from other species as well as to other types of next-generation sequencing data sets. Finally, we show through simulations that increasing the number of samples is more beneficial than increasing the library size for mapping both the strain and parent-of-origin effects. Unless sample recruiting is too expensive to conduct, we recommend sequencing more samples with lower coverage.
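
To make the idea of overdispersed allele-specific read counts concrete, the following is a minimal sketch and not the authors' implementation. It assumes a beta-binomial distribution, a common choice for such counts; the abstract itself does not name the distribution used. The function names, the example counts, the fixed overdispersion value `rho`, and the grid search are all hypothetical and chosen only for illustration. The sketch fits a single allelic proportion for one gene and compares it with the balanced value 0.5 through a likelihood-ratio statistic.

```python
import math


def betabinom_logpmf(k, n, pi, rho):
    """Log-probability of k reads from allele A out of n allele-specific reads.

    pi  : expected proportion of reads from allele A (0 < pi < 1)
    rho : overdispersion in (0, 1); as rho -> 0 this approaches the binomial
    """
    # Convert (pi, rho) to beta shape parameters, using rho = 1 / (a + b + 1).
    s = (1.0 - rho) / rho
    a, b = pi * s, (1.0 - pi) * s
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    log_beta_num = math.lgamma(k + a) + math.lgamma(n - k + b) - math.lgamma(n + a + b)
    log_beta_den = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return log_choose + log_beta_num - log_beta_den


def loglik(counts, pi, rho):
    """Sum the log-likelihood over samples; counts holds (k, n) pairs."""
    return sum(betabinom_logpmf(k, n, pi, rho) for k, n in counts)


# Hypothetical data: (allele A reads, total allele-specific reads)
# for one gene in four F1 animals.
counts = [(62, 100), (55, 90), (70, 120), (48, 80)]
rho = 0.05  # assumed known for this sketch; in practice it would be estimated

# Crude grid search for the allelic proportion that maximizes the likelihood.
grid = [i / 1000 for i in range(1, 1000)]
pi_hat = max(grid, key=lambda p: loglik(counts, p, rho))

# Likelihood-ratio statistic against allelic balance (pi = 0.5).
lrt = 2.0 * (loglik(counts, pi_hat, rho) - loglik(counts, 0.5, rho))
print(f"pi_hat = {pi_hat:.3f}, LRT statistic = {lrt:.3f}")
```

A fuller analysis along the lines the abstract describes would also model total read counts jointly, estimate the overdispersion, and include covariates such as sex and read depth; none of that is attempted here.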
