Journal: Scientific Reports

Identifying inaccuracies in gene expression estimates from unstranded RNA-seq data


Abstract

RNA-seq methods are widely used for transcriptomic profiling of biological samples. However, this technology has known caveats that can skew gene expression estimates. Specifically, if the library preparation protocol does not retain RNA strand information, some genes can be erroneously quantitated. Although strand-specific protocols have been established, a significant portion of RNA-seq data is generated in a non-strand-specific manner. We used a comprehensive stranded RNA-seq dataset of 15 blood cell types to identify genes for which expression would be erroneously estimated if strand information were not available. We found that about 10% of all genes and 2.5% of protein coding genes have a two-fold or higher difference in estimated expression when the strand information of the reads is ignored. We used parameters of the read alignments of these genes to construct a machine learning model that can identify which genes in an unstranded dataset are likely to have incorrect expression estimates and which are not. We also show that differential expression analysis of genes with biased expression estimates in unstranded read data can be recovered by limiting the reads considered to those that span exonic boundaries. The resulting approach is implemented as a package available at https://github.com/mikpom/uslcount.
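The two ideas in the abstract, a two-fold discrepancy criterion and restricting counts to reads that span exonic boundaries, can be illustrated with a minimal sketch. This is not the uslcount API or the authors' implementation: the function names, the pseudocount, and the gene counts below are hypothetical, and detecting boundary-spanning reads through the `N` (skipped region) operation of a SAM CIGAR string is an assumption about how such reads might be recognized.

```python
import math
import re

# Hypothetical per-gene read counts (illustrative numbers, not from the paper).
stranded = {"GENE_A": 100, "GENE_B": 40, "GENE_C": 0}
unstranded = {"GENE_A": 110, "GENE_B": 95, "GENE_C": 12}


def flag_biased(stranded, unstranded, pseudocount=1.0, min_fold=2.0):
    """Return genes whose unstranded estimate differs from the stranded
    estimate by at least min_fold in either direction.

    The pseudocount (an assumption of this sketch) avoids division by zero.
    """
    threshold = math.log2(min_fold)
    flagged = []
    for gene, s in stranded.items():
        u = unstranded.get(gene, 0)
        log2_ratio = math.log2((u + pseudocount) / (s + pseudocount))
        if abs(log2_ratio) >= threshold:
            flagged.append(gene)
    return flagged


def spans_junction(cigar):
    """True if a SAM CIGAR string contains an 'N' operation,
    which marks a spliced alignment (a skipped region in the reference)."""
    ops = re.findall(r"(\d+)([MIDNSHP=X])", cigar)
    return any(op == "N" for _, op in ops)


print(flag_biased(stranded, unstranded))
print(spans_junction("50M1000N50M"), spans_junction("100M"))
```

In practice the comparison would be made on normalized expression estimates rather than raw counts, and the package itself relies on a trained model rather than a fixed threshold, since no stranded reference is available for an unstranded dataset.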
