Journal: Bioinformatics

Estimates of allele-specific expression in Drosophila with a single genome sequence and RNA-seq data


Abstract

Motivation: Genetic variation in cis-regulatory elements is an important cause of variation in gene expression. Cis-regulatory variation can be detected by using high-throughput RNA sequencing (RNA-seq) to identify differences in the expression of the two alleles of a gene. This requires that reads from the two alleles are equally likely to map to a reference genome(s), and that single-nucleotide polymorphisms (SNPs) are accurately called, so that reads derived from the different alleles can be identified. Both of these prerequisites can be achieved by sequencing the genomes of the parents of the individual being studied, but this is often prohibitively costly.

Results: In Drosophila, we demonstrate that biases during read mapping can be avoided by mapping reads to two alternative genomes that incorporate SNPs called from the RNA-seq data. The SNPs can be reliably called from the RNA-seq data itself, provided any variants not found in high-quality SNP databases are filtered out. Finally, we suggest a way of measuring allele-specific expression (ASE) by crossing the line of interest to a reference line with a high-quality genome sequence. Combined with our bioinformatic methods, this approach minimizes mapping biases, allows poor-quality data to be identified and removed, and aids in the biological interpretation of the data, as the parent of origin of each allele is known. In conclusion, our results suggest that accurate estimates of ASE do not require the parental genomes of the individual being studied to be sequenced.
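The three ideas in the abstract (filtering called SNPs against a trusted database, building two alternative genomes that differ only at those SNPs, and testing for allelic imbalance) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the toy data structures, the 0-based coordinates and the use of an exact binomial test against a 50:50 expectation are all assumptions made for illustration.

```python
from math import comb


def filter_snps(called, known):
    """Keep called SNPs whose (chrom, pos, alt) is in a trusted set.

    `called` maps (chrom, pos) -> (ref, alt); `known` is a set of
    (chrom, pos, alt). Both layouts are hypothetical.
    """
    return {
        (chrom, pos): (ref, alt)
        for (chrom, pos), (ref, alt) in called.items()
        if (chrom, pos, alt) in known
    }


def build_alternative_genomes(reference, snps):
    """Return two genomes: one carrying ref alleles, one carrying alt alleles.

    Positions are 0-based here (an assumption of this sketch).
    """
    genome_a = {chrom: list(seq) for chrom, seq in reference.items()}
    genome_b = {chrom: list(seq) for chrom, seq in reference.items()}
    for (chrom, pos), (ref, alt) in snps.items():
        genome_a[chrom][pos] = ref
        genome_b[chrom][pos] = alt
    return (
        {chrom: "".join(seq) for chrom, seq in genome_a.items()},
        {chrom: "".join(seq) for chrom, seq in genome_b.items()},
    )


def allelic_ratio(count_a, count_b):
    """Fraction of allele-informative reads assigned to allele A."""
    total = count_a + count_b
    return count_a / total if total else float("nan")


def binomial_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p-value.

    Sums the probabilities of all outcomes that are no more likely
    than the observed count k out of n.
    """
    probs = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = probs[k]
    return min(1.0, sum(q for q in probs if q <= observed * (1 + 1e-9)))


# Toy data, invented purely for illustration.
reference = {"2L": "ACGTACGT"}
called = {("2L", 2): ("G", "T"), ("2L", 5): ("C", "A")}
known = {("2L", 2, "T")}  # only the first SNP is in the trusted set

kept = filter_snps(called, known)
genome_a, genome_b = build_alternative_genomes(reference, kept)
print(genome_a["2L"], genome_b["2L"])
print(allelic_ratio(30, 10), binomial_two_sided_p(30, 40))
```

In a real analysis, reads would be mapped to both genomes with an aligner and counted per allele at each retained SNP; the sketch covers only the bookkeeping around that step.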
