Journal: Bioinformatics

ORMAN: Optimal resolution of ambiguous RNA-Seq multimappings in the presence of novel isoforms


Abstract

Motivation: RNA-Seq technology promises to uncover many novel alternative splicing events, gene fusions and other variations in RNA transcripts. For accurate detection and quantification of transcripts, it is important to resolve the mapping ambiguity of RNA-Seq reads that can be mapped to multiple loci: >17% of the reads from mouse RNA-Seq data and 50% of the reads from some plant RNA-Seq data have multiple mapping loci. In this study, we show how to resolve the mapping ambiguity in the presence of novel transcriptomic events, such as exon skipping and novel indels, to support accurate downstream analysis. We introduce ORMAN (Optimal Resolution of Multimapping Ambiguity of RNA-Seq Reads), which aims to compute the minimum number of potential transcript products for each gene and to assign each multimapping read to one of these transcripts based on the estimated distribution of the region covering the read. ORMAN achieves this objective through a combinatorial optimization formulation, which is solved through well-known approximation algorithms, integer linear programs and heuristics.

Results: On a simulated RNA-Seq dataset including a random subset of transcripts from the UCSC database, the performance of several state-of-the-art methods for identifying and quantifying novel transcripts, such as Cufflinks, IsoLasso and CLIIQ, is significantly improved through the use of ORMAN. Furthermore, in an experiment using real RNA-Seq reads, we show that ORMAN is able to resolve multimappings to produce coverage values that are similar to the original distribution, even in genes with highly non-uniform coverage.
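The abstract does not spell out the optimization itself, so the following is only an illustrative sketch of the general idea, not ORMAN's actual formulation. It assumes the "minimum number of transcripts" step can be approximated by the classic greedy set-cover heuristic, and it replaces the coverage-distribution-based assignment with a simple load-balancing stand-in. The function names `greedy_min_transcripts` and `assign_reads`, and the input shape (a dict from read ID to candidate transcript IDs), are hypothetical.

```python
from collections import defaultdict


def greedy_min_transcripts(read_to_transcripts):
    """Greedy set cover (an assumed stand-in for the paper's solver):
    choose few transcripts so that every read maps to a chosen one."""
    transcript_to_reads = defaultdict(set)
    for read, transcripts in read_to_transcripts.items():
        for t in transcripts:
            transcript_to_reads[t].add(read)

    # Only reads with at least one candidate transcript can be covered.
    uncovered = {r for r, ts in read_to_transcripts.items() if ts}
    chosen = []
    while uncovered:
        # Pick the transcript explaining the most still-uncovered reads;
        # sorting makes tie-breaking deterministic (alphabetical).
        best = max(sorted(transcript_to_reads),
                   key=lambda t: len(transcript_to_reads[t] & uncovered))
        chosen.append(best)
        uncovered -= transcript_to_reads[best]
    return chosen


def assign_reads(read_to_transcripts, chosen):
    """Assign each read to exactly one chosen transcript.
    Assumption: the least-loaded candidate wins, a crude proxy for the
    coverage-distribution criterion described in the abstract."""
    chosen_set = set(chosen)
    load = {t: 0 for t in chosen}
    candidates = {r: [t for t in ts if t in chosen_set]
                  for r, ts in read_to_transcripts.items()}
    assignment = {}
    # Handle the least ambiguous reads first.
    for read in sorted(candidates, key=lambda r: (len(candidates[r]), r)):
        if not candidates[read]:
            continue  # read has no candidate among the chosen transcripts
        target = min(candidates[read], key=lambda t: (load[t], t))
        assignment[read] = target
        load[target] += 1
    return assignment


# Toy input: r2 and r3 are multimapping reads.
reads = {"r1": ["A"], "r2": ["A", "B"], "r3": ["B", "C"], "r4": ["C"]}
chosen = greedy_min_transcripts(reads)
assignment = assign_reads(reads, chosen)
```

In this toy input, transcripts A and C suffice to explain all four reads, so the ambiguous reads r2 and r3 are assigned to A and C respectively and B is dropped. An exact solution would instead use an integer linear program, as the abstract mentions.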

