Genomic and functional analysis of next-generation sequencing data.

Abstract

Advances in next-generation sequencing (NGS) technologies have significantly reduced the cost per sequenced base pair and increased the volume of sequence data. On the other hand, most currently used NGS technologies produce relatively short sequence reads (50-150 bp) compared to Sanger sequencing (~700 bp). This poses an additional challenge in data analysis, because shorter reads are more difficult to assemble. At this point, production of sequencing data outpaces our capacity to analyze it. Newer NGS technologies capable of producing longer reads are emerging, which should simplify and speed up genome assembly. However, this will only increase the number of sequenced genomes that lack structural and functional annotation. In addition to multiple scientific initiatives to sequence thousands of genomes, personalized medicine centered on sequencing and analysis of individual human genomes will become more widely available. This poses a challenge for computer science and emphasizes the importance of developing new computational algorithms, methodologies, tools, and pipelines.

This dissertation focuses on the development of such software tools, methodologies, and resources to help address the need to process the volumes of data generated by new sequencing technologies. The research concentrated on genome structure analysis, individual variation, and comparative biology. This dissertation presents: (1) the Short Read Classification Pipeline (SRCP) for preliminary genome characterization of unsequenced genomes; (2) a novel methodology for phylogenetic analysis of closely related organisms, or strains of the same organism, without a sequenced genome; (3) a centralized online resource for standardized gene nomenclature. Using the SRCP and the methodology for initial phylogenetic analysis developed in this dissertation makes it possible to position the organism in its evolutionary context. This should facilitate identification of orthologs between species and paralogs within a species, even in the initial stage of the analysis when only the exome is sequenced. It thus enables functional annotation by transferring gene nomenclature from well-annotated 1:1 orthologs, as required by the online standardized gene nomenclature resource developed in this dissertation. The tools, methodology, and resources presented here are therefore tied together by the initial analysis workflow for structural and functional annotation.

Bibliographic record

  • Author

    Chouvarine, Philippe

  • Author affiliation

    Mississippi State University

  • Degree-granting institution: Mississippi State University
  • Subject: Biology Genetics; Biology Bioinformatics
  • Degree: Ph.D.
  • Year: 2012
  • Pages: 75 p.
  • Total pages: 75
  • Original format: PDF
  • Language of text: English
