
Kernel machine methods for analysis of genomic data from different sources.


Abstract

Comprehensive understanding of complex trait etiology requires examination of multiple sources of genomic variability. Recent advances in high-throughput biotechnology, especially sequencing technology, have enabled multi-platform genomic profiling of biological samples. In this dissertation, we consider using the kernel machine regression (KMR) framework to analyze data from different genetic data sources.

In the first part of this dissertation, we develop a new strategy for identifying large-scale, global changes in methylation that are associated with environmental variables or clinical outcomes via a functional regression approach. The density or the cumulative distribution function of the methylation values for each individual can be approximated using B-spline basis functions, with the spline coefficients summarizing the individual's overall methylation profile. A variance component score test is proposed to test for association between the overall distribution and a continuous or dichotomous outcome, and is applied to two real studies.

In the second part, we construct a microbiome regression-based kernel association test (MiRKAT) for testing the association between microbial community profiles and a continuous or dichotomous variable of interest, such as an environmental exposure or disease status. This method regresses the outcome on the covariates (including potential confounders) and the microbiome compositional profiles through kernel functions. We demonstrate the improved control of type I error and superior power of MiRKAT compared to existing methods through simulations and real studies.

In the final part, we focus on integrative analysis of genome-wide association studies (GWAS) and methylation studies. We propose using KMR first to test the cumulative genetic/epigenetic effect on a trait and then for mediation analysis to understand the mechanisms by which the genomic data influence the trait. In particular, we develop an approach that works at the gene level (to allow for a common analysis unit across data types). We compare pairwise similarity in trait values between individuals to pairwise similarity in methylation and genotype values, with correspondence suggestive of association. For a significant gene, we develop a causal steps approach to mediation analysis, which enables elucidation of the manner in which the different data types work, or do not work, together.
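The idea of comparing pairwise similarity in genomic profiles with similarity in trait values can be illustrated with a small kernel machine score test. The following is only a minimal sketch under stated assumptions, not the dissertation's implementation: it assumes a continuous outcome, a linear kernel, and a residual-permutation p-value swapped in for whatever analytic null distribution the author uses. The function names (`linear_kernel`, `kernel_score_statistic`, `permutation_pvalue`) are hypothetical.

```python
import numpy as np


def linear_kernel(Z):
    """Pairwise similarity between individuals (rows of Z): K = Z Z^T."""
    return Z @ Z.T


def kernel_score_statistic(y, X, K):
    """Score-type statistic Q = r^T K r for a continuous outcome.

    r holds the residuals from an ordinary least squares fit of y on the
    covariates X (an intercept is added here), i.e. the fit under the null
    hypothesis of no genomic effect.
    """
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    r = y - X1 @ beta
    return float(r @ K @ r), r


def permutation_pvalue(y, X, K, n_perm=999, seed=0):
    """Approximate p-value by permuting the null-model residuals.

    Assumption: this permutation scheme is a simple stand-in for an
    analytic null distribution and is only approximate with covariates.
    """
    rng = np.random.default_rng(seed)
    q_obs, r = kernel_score_statistic(y, X, K)
    exceed = 0
    for _ in range(n_perm):
        rp = rng.permutation(r)
        if rp @ K @ rp >= q_obs:
            exceed += 1
    return q_obs, (exceed + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Simulated data (purely illustrative, not from the dissertation).
    rng = np.random.default_rng(1)
    n = 100
    Z = rng.normal(size=(n, 5))   # genomic features, e.g. for one gene
    X = rng.normal(size=(n, 1))   # a covariate / potential confounder
    y = 0.5 * Z.sum(axis=1) + X[:, 0] + rng.normal(size=n)
    q, p = permutation_pvalue(y, X, linear_kernel(Z))
    print(f"Q = {q:.2f}, permutation p-value = {p:.4f}")
```

A large Q means that individuals with similar genomic profiles also tend to have similar residual trait values. Other kernels (for example, a distance-based kernel for microbiome profiles) could be passed in place of `linear_kernel` without changing the rest of the sketch.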

Bibliographic record

  • Author: Zhao, Ni.
  • Author affiliation: The University of North Carolina at Chapel Hill.
  • Degree-granting institution: The University of North Carolina at Chapel Hill.
  • Subject: Biology Biostatistics.
  • Degree: Ph.D.
  • Year: 2014
  • Pages: 137 p.
  • Original format: PDF
  • Language of text: English
