Frontiers in Genetics

Population Stratification in the Context of Diverse Epidemiologic Surveys Sans Genome-Wide Data

Abstract

Population stratification, or confounding by genetic ancestry, is a potential cause of false associations in genetic association studies. Estimating and adjusting for genetic ancestry has become common practice, thanks in part to the availability of ancestry informative markers on genome-wide association study (GWAS) arrays. Although array data are now widespread, they are not ubiquitous: several large epidemiologic and clinic-based studies lack genome-wide data. One such large epidemiologic study without genome-wide data accessible to investigators is the National Health and Nutrition Examination Surveys (NHANES), a set of population-based cross-sectional surveys of Americans conducted by the Centers for Disease Control and Prevention and linked to demographic, health, and lifestyle data. DNA samples (n = 14,998) were extracted from biospecimens from consented NHANES participants in 1991–1994 (NHANES III, phase 2) and 1999–2002 and represent three major self-identified racial/ethnic groups: non-Hispanic whites (n = 6,634), non-Hispanic blacks (n = 3,458), and Mexican Americans (n = 3,950). We, as the Epidemiologic Architecture for Genes Linked to Environment study, genotyped candidate gene and GWAS-identified index variants in NHANES as part of the larger Population Architecture using Genomics and Epidemiology I study for collaborative genetic association studies. To enable basic quality control, such as estimating genetic ancestry to control for population stratification in NHANES sans genome-wide data, we outline here strategies that use limited genetic data to identify the markers optimal for characterizing genetic ancestry. From among the 411 and 295 autosomal SNPs available in NHANES III and NHANES 1999–2002, respectively, we demonstrate that markers with ancestry information can be identified to estimate global ancestry. Despite limited resolution, global genetic ancestry is highly correlated with self-identified race for the majority of participants, although less so for ethnicity. Overall, the strategies outlined here for a large epidemiologic study can be applied to other datasets that are accessible for genotype–phenotype studies but lack genome-wide data.
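
The abstract does not say how ancestry-informative markers were chosen from the limited SNP panel. As a rough illustration only, one common approach is to rank SNPs by how strongly their allele frequencies differ between two reference populations. The sketch below assumes that approach; the function names, the SNP identifiers, the example frequencies, and the 0.3 cutoff are all hypothetical and are not taken from the study.

```python
from typing import Dict, List, Tuple


def allele_freq_delta(p1: float, p2: float) -> float:
    """Absolute allele-frequency difference between two populations."""
    return abs(p1 - p2)


def fst_two_pop(p1: float, p2: float) -> float:
    """Simple heterozygosity-based Fst for one biallelic SNP.

    Assumes two populations weighted equally (an illustrative choice,
    not the study's estimator).
    """
    p_bar = (p1 + p2) / 2.0
    h_total = 2.0 * p_bar * (1.0 - p_bar)  # expected heterozygosity, pooled
    if h_total == 0.0:
        return 0.0  # SNP is monomorphic in both populations
    h_within = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    return (h_total - h_within) / h_total


def rank_markers(
    freqs: Dict[str, Tuple[float, float]],
    min_delta: float = 0.3,  # hypothetical cutoff, chosen for illustration
) -> List[Tuple[str, float, float]]:
    """Return (snp, delta, fst) for SNPs with delta >= min_delta, best first."""
    scored = [
        (snp, allele_freq_delta(p1, p2), fst_two_pop(p1, p2))
        for snp, (p1, p2) in freqs.items()
    ]
    kept = [row for row in scored if row[1] >= min_delta]
    return sorted(kept, key=lambda row: row[1], reverse=True)


if __name__ == "__main__":
    # Made-up reference allele frequencies: (population 1, population 2).
    example = {
        "snp_a": (0.9, 0.1),
        "snp_b": (0.5, 0.45),
        "snp_c": (0.2, 0.7),
    }
    for snp, delta, fst in rank_markers(example):
        print(f"{snp}\tdelta={delta:.2f}\tFst={fst:.2f}")
```

Markers that pass such a filter could then be passed to an ancestry-estimation method of the analyst's choice; with only a few hundred SNPs, the resulting estimates would be expected to have limited resolution, as the abstract notes.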
