Discovery and Applications of Bacterial Noncoding RNAs.

Abstract

Noncoding RNAs (ncRNAs) are functional transcripts that do not code for proteins. Many of them play indispensable roles in the cell. For example, ribosomal RNAs make up the ribosome, the factory for making proteins, and riboswitches bind to small metabolites in the cell and regulate gene expression. Computational discovery of ncRNAs is challenging, however, because ncRNAs evolve rapidly at the nucleotide level while preserving secondary structure. In the first part of this thesis, we develop two clustering algorithms that are robust to weak sequence homology signals and are applicable on the genomic scale. We show that both algorithms can recover most known ncRNA families and that as few as 5 homologous sequences are needed to predict a strong motif.

In the second part of the thesis, we investigate whether secondary structure information improves maximum likelihood tree inference for ncRNAs. An accurate phylogenetic tree has important biological and clinical applications: it can be used to infer the function of novel organisms and to understand the evolutionary history of species. We show that using structure information, a more realistic gap model, and a maximum likelihood approach improves phylogenetic tree inference.

In the third part of the thesis, we develop a method for profiling human gut microbial communities using high-throughput sequencing. Our method works on Illumina short reads and does not require assembly or taxonomic identification. We show that it can differentiate between the gut microbiota of healthy individuals at low sequencing depth, making it a cost-effective screening tool for large population studies.

In the final part of the thesis, we use a standard additions experiment to examine sequencing bias and errors in Illumina HiSeq. We identify features associated with systematic errors and develop an error correction pipeline. We show that our method reduces base errors and produces better species diversity estimates.

Bibliographic record

  • Author: Tseng, Huei-Hun Elizabeth
  • Author affiliation: University of Washington
  • Degree-granting institution: University of Washington
  • Subjects: Biology, Microbiology; Biology, Bioinformatics; Computer Science
  • Degree: Ph.D.
  • Year: 2012
  • Pages: 183 p.
  • Total pages: 183
  • Original format: PDF
  • Language of text: English