Statistical models for RNA biology: from single nucleotides to single cells.

Abstract

With the advent of RNA sequencing and other high-throughput molecular assays, RNA biology has recently transitioned from careful curation of single-hypothesis experiments to data-driven design of multi-hypothesis investigations. Fortunately, statistical advances and increasingly powerful computers have given rise to machine learning, a computational framework that can automatically distill perpetually growing datasets into predictive models of fundamental cellular and disease processes. Finally, recent advances in microfluidics have enabled the efficient capture and interrogation of individual cells by a variety of molecular assays. My research bridges these fields by introducing predictive statistical models of RNA abundance and processing in single cells to uncover new insights into the regulation of RNA editing and splicing and their effects on cellular differentiation.

This dissertation collects my contributions in single-cell and statistical genomics, from low-level details of data analysis to high-level principles of cellular identity and diversity. My early contributions concentrate on building error models of RNA sequencing data in order to extract biologically relevant signals from experimental noise and sampling biases inherent in high-throughput sequencing technologies. Specifically, I describe statistical models of RNA splicing and editing that are robust to noise from PCR duplicates or sequencing errors and to uneven sampling from incomplete reverse transcription or cDNA fragmentation biases. I then evaluate the models' self-consistency and compare their accuracy relative to a gold standard. With a solid statistical foundation for sequencing data analysis established, my latest contributions focus on developing principled methods of constructing and evaluating compelling biological hypotheses in collaboration with domain experts. Specifically, I describe a Bayesian model of A-to-I RNA editing whose high specificity helped resolve the functional difference between the catalytically active RNA-binding protein ADR-2 and its inactive homolog ADR-1. In another collaboration, I used machine learning to resolve a long-standing question in immunology regarding the asymmetric specification of T cells into two functionally distinct lineages. Here, through one of the first applications of single-cell gene expression analysis of the immune system, I demonstrate that pathogen-activated T cells undergo an early bifurcation into effector- and memory-fated populations and help identify the genes whose asymmetric expression drives this phenomenon. Together, these contributions establish a principled statistical framework for experimental design and analysis that integrates both hypothesis- and data-driven models to validate new findings and uncover novel principles of RNA biology.

Bibliographic record

  • Author: Kakaradov, Boyko
  • Author affiliation: University of California, San Diego
  • Degree-granting institution: University of California, San Diego
  • Subject: Biology, Bioinformatics; Statistics; Biology, Molecular
  • Degree: Ph.D.
  • Year: 2014
  • Pages: 120
  • Original format: PDF
  • Language of text: English