
Computational analysis of RNA-Seq data in the absence of a known genome.



Abstract

RNA-Seq technology has revolutionized the way we study transcriptomes. In particular, it has enabled us to investigate the transcriptomes of species that have not yet had their genomes sequenced. This thesis focuses on two computational tasks that are crucial to analyzing RNA-Seq data in the absence of a sequenced genome: transcript quantification and de novo transcriptome assembly evaluation.

For transcript quantification, RNA-Seq is considered a more accurate replacement for microarrays. However, to allow for the highest accuracy, methods for analyzing RNA-Seq data must address the challenge of handling reads that map to multiple genes or isoforms. We present RSEM, a generative statistical model of the sequencing process and associated inference methods, which tackles this challenge in a principled manner. Our results on both simulated and real data sets suggest that RSEM has superior or comparable performance to other quantification methods developed at the same time.

To facilitate the usage of our method, we implement RSEM as a robust and user-friendly software package for quantifying gene and isoform abundances from single-end or paired-end RNA-Seq data. RSEM outputs abundance estimates, 95% credibility intervals, and visualization files and can also simulate RNA-Seq data. In contrast to other existing tools, the software does not require a reference genome. Thus, in combination with a de novo transcriptome assembler, RSEM enables accurate transcript quantification for species without sequenced genomes.

Building off of RSEM, we have developed a novel probabilistic model based method, RSEM-EVAL, for evaluating de novo transcriptome assemblies from RNA-Seq data without the ground truth. Our RSEM-EVAL score has a broad range of potential applications, such as selecting assemblers, optimizing parameters for an assembler and guiding new assembler design. Results on both simulated and real data sets show that the RSEM-EVAL score correctly reflects the accuracies of the assemblies. To demonstrate its usage, we assembled the transcriptome of the regenerating axolotl limb by selecting among over 100 candidate assemblies based on their RSEM-EVAL scores.
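The abstract's central technical point, handling reads that map to multiple genes or isoforms, can be illustrated with a toy expectation-maximization (EM) loop. The sketch below is not RSEM's actual model or code: it assumes error-free reads, uniform start positions along each transcript, and known effective lengths. The function name `em_abundance` and its input layout are hypothetical, chosen only for illustration.

```python
def em_abundance(read_alignments, transcript_lengths, n_iter=200):
    """Toy EM estimate of the fraction of reads coming from each transcript.

    read_alignments: list of lists; each inner list holds the transcript IDs
        that one read aligns to (more than one ID means a multi-mapping read).
    transcript_lengths: dict mapping transcript ID -> effective length (> 0).
    Returns a dict mapping transcript ID -> estimated read fraction.

    Assumption (not from the thesis): every alignment is equally good, so a
    read's likelihood under a transcript is proportional to 1 / length.
    """
    # Ignore reads with no alignments; they carry no information here.
    reads = [hits for hits in read_alignments if hits]
    if not reads:
        raise ValueError("no aligned reads")

    transcripts = list(transcript_lengths)
    # Start from a uniform guess over transcripts.
    theta = {t: 1.0 / len(transcripts) for t in transcripts}

    for _ in range(n_iter):
        expected = dict.fromkeys(transcripts, 0.0)
        # E-step: split each read fractionally among its candidate
        # transcripts, in proportion to theta[t] / length[t].
        for hits in reads:
            weights = [theta[t] / transcript_lengths[t] for t in hits]
            total = sum(weights)
            for t, w in zip(hits, weights):
                expected[t] += w / total
        # M-step: the new read fraction is the expected read count
        # divided by the number of reads.
        theta = {t: expected[t] / len(reads) for t in transcripts}

    return theta


if __name__ == "__main__":
    # Hypothetical data: 3 reads unique to A, 1 unique to B, 4 ambiguous.
    reads = [["A"]] * 3 + [["B"]] + [["A", "B"]] * 4
    print(em_abundance(reads, {"A": 100.0, "B": 100.0}))
```

With the hypothetical data in the usage example, the estimates converge to 0.75 for A and 0.25 for B: the ambiguous reads are shared out in the 3:1 ratio suggested by the uniquely mapping reads rather than being discarded or split evenly.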

Bibliographic record

  • Author: Li, Bo.
  • Author affiliation: The University of Wisconsin - Madison.
  • Degree-granting institution: The University of Wisconsin - Madison.
  • Subjects: Biology Biostatistics; Biology Bioinformatics; Computer Science.
  • Degree: Ph.D.
  • Year: 2013
  • Pages: 120 p.
  • Total pages: 120
  • Original format: PDF
  • Language: English
