Tools for genetic data management and strategies for optimized imputation of missing genotypes.


Abstract

This dissertation includes two main areas of research. The first focuses on the design and development of a genetic study data management and analysis system that aims to ease the burden of dealing with the very large amounts of genetic linkage and association study data from high throughput genotyping platforms and to facilitate the integration of data from multiple sources. The Genetic Study Database (GSD) system is designed to provide security in data transmission and user management, flexibility in study data management and simplicity in user interface operations.

The second area of research focuses on the imputation of inherited genetic polymorphisms or rare variants. Since 2001, with the advent of high throughput sequencing technologies, the cost of sequencing an entire human genome has dropped from 100 million dollars to less than five thousand dollars per genome. Nevertheless, it is still too costly to obtain whole genome sequencing data for every individual in a research study involving thousands of subjects. Genotype imputation, also called in-silico genotyping, is a cost-effective and efficient way to maximize genome coverage in an association study for little or no additional cost. Depending on the type of genetic study, there are two approaches for doing genotype imputation: population-based and family-based. Both are covered in the research reported here.

The population-based approach takes advantage of publicly available genotype reference panels in predicting genotypes of unobserved variants among unrelated individuals. Here, the focus will be on optimizing the post-imputation filtering strategy to find the appropriate balance in the tradeoff between accuracy and the yield of the imputation process (i.e., maximize the number of genotypes imputed). The family-based approach leverages the rich information available in a pedigree to increase power for imputing genotypes of unobserved variants among biological relatives. When performing family-based imputation, it is important to decide how many family members and which family members to select for high density variant genotyping. Their data will be used to predict genotypes of other family members. Therefore, one aim of this part of the research will be to evaluate different family-based imputation designs to identify cost-effective strategies.

This dissertation includes three chapters: 1) designing and building a sophisticated web-based genetic study data management system, 2) identifying an optimized set of genotype/SNP filters for population-based imputation, and 3) discovering the most efficient family-based imputation strategies for various pedigree structures.
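
The accuracy-versus-yield tradeoff in post-imputation filtering can be illustrated with a minimal sketch. It is not the dissertation's filter set, which the abstract does not specify. It assumes each imputed genotype comes with posterior probabilities for 0, 1, or 2 copies of the alternate allele, and that a genotype is kept only when its highest probability meets a threshold. The function names, the toy data, and the thresholds are all hypothetical.

```python
from typing import List, Sequence, Tuple

MISSING = -1  # marker for a genotype dropped by the filter (assumption)


def filter_by_call_probability(
    posteriors: Sequence[Sequence[float]], threshold: float
) -> List[int]:
    """Keep the best-guess genotype only if its posterior meets the threshold."""
    calls = []
    for probs in posteriors:
        # Best guess = genotype (0, 1, or 2 alternate alleles) with the highest posterior.
        best = max(range(len(probs)), key=lambda g: probs[g])
        calls.append(best if probs[best] >= threshold else MISSING)
    return calls


def yield_and_concordance(
    calls: Sequence[int], truth: Sequence[int]
) -> Tuple[float, float]:
    """Yield = fraction of genotypes kept; concordance = fraction of kept calls that match truth."""
    kept = [(c, t) for c, t in zip(calls, truth) if c != MISSING]
    yield_fraction = len(kept) / len(calls)
    concordance = (
        sum(c == t for c, t in kept) / len(kept) if kept else float("nan")
    )
    return yield_fraction, concordance


# Toy data (made up): posteriors for four imputed genotypes and their true values.
posteriors = [
    (0.98, 0.02, 0.00),
    (0.10, 0.85, 0.05),
    (0.40, 0.55, 0.05),
    (0.01, 0.04, 0.95),
]
truth = [0, 1, 0, 2]

for threshold in (0.5, 0.9):
    calls = filter_by_call_probability(posteriors, threshold)
    print(threshold, yield_and_concordance(calls, truth))
```

On this toy data a lenient threshold of 0.5 keeps every genotype but includes one wrong call, while a strict threshold of 0.9 keeps only half of the genotypes, all of them correct. An optimized filtering strategy would look for the setting that best balances these two quantities.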

Bibliographic record

  • Author

    Kuo, Fengshen.;

  • Author affiliation

    Rutgers The State University of New Jersey, School of Health Related Professions.;

  • Degree-granting institution Rutgers The State University of New Jersey, School of Health Related Professions.;
  • Subject Bioinformatics.;Genetics.;Computer science.
  • Degree Ph.D.
  • Year 2014
  • Pages 152 p.
  • Total pages 152
  • Original format PDF
  • Language of text eng
