Journal: Molecular Biology and Evolution

Detecting Genomic Signatures of Natural Selection with Principal Component Analysis: Application to the 1000 Genomes Data



Abstract

To characterize natural selection, various analytical methods for detecting candidate genomic regions have been developed. We propose to perform genome-wide scans of natural selection using principal component analysis (PCA). We show that the common FST index of genetic differentiation between populations can be viewed as the proportion of variance explained by the principal components. Considering the correlations between genetic variants and each principal component provides a conceptual framework to detect genetic variants involved in local adaptation without any prior definition of populations. To validate the PCA-based approach, we analyze the 1000 Genomes data (phase 1), using 850 individuals from Africa, Asia, and Europe. The data set contains on the order of 36 million genetic variants, obtained with low-coverage sequencing (3×). The correlations between genetic variation and each principal component identify well-known targets of positive selection (EDAR, SLC24A5, SLC45A2, DARC) as well as new candidate genes (APPBPP2, TP1A1, RTTN, KCNMA, MYO5C) and noncoding RNAs. In addition to identifying genes involved in biological adaptation, we identify two biological pathways involved in polygenic adaptation that are related to the innate immune system (beta defensins) and to lipid metabolism (fatty acid omega oxidation). An additional analysis of European data shows that a genome scan based on PCA retrieves classical examples of local adaptation even when there are no well-defined populations. PCA-based statistics, implemented in the PCAdapt R package and the PCAdapt fast open-source software, retrieve well-known signals of human adaptation, which is encouraging for future whole-genome sequencing projects, especially when defining populations is difficult.
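
The core idea of the abstract, scoring each variant by its correlation with the principal components instead of by a population-based index, can be illustrated with a minimal NumPy sketch. This is not the PCAdapt implementation. The function names, the two-population simulation, the drift level, and the number of planted outlier variants are all assumptions chosen for illustration only.

```python
import numpy as np


def squared_pc_correlations(genotypes, n_components):
    """Squared correlation between every variant and each of the top PCs.

    genotypes: (n_individuals, n_variants) array of allele counts (0, 1, 2).
    Returns an (n_variants, n_components) array with values in [0, 1].
    """
    G = np.asarray(genotypes, dtype=float)
    sd = G.std(axis=0)
    if np.any(sd == 0):
        raise ValueError("remove monomorphic variants before running PCA")
    # Center and scale each variant (column).
    Z = (G - G.mean(axis=0)) / sd
    # Left singular vectors are the individuals' PC scores, up to scale.
    U, _, _ = np.linalg.svd(Z, full_matrices=False)
    scores = U[:, :n_components]
    # Columns of Z have norm sqrt(n); the score vectors are centered and
    # unit-norm, so this product is the Pearson correlation.
    r = Z.T @ scores / np.sqrt(G.shape[0])
    return r ** 2


def simulate_two_populations(n_per_pop=100, n_variants=1000, n_outliers=5, seed=0):
    """Hypothetical toy data: two populations with mild drift at most variants
    and strongly differentiated allele frequencies at the first few."""
    rng = np.random.default_rng(seed)
    ancestral = rng.uniform(0.1, 0.9, size=n_variants)
    freqs = ancestral + rng.normal(0.0, 0.05, size=(2, n_variants))
    freqs = np.clip(freqs, 0.05, 0.95)
    outliers = np.arange(n_outliers)
    freqs[0, outliers] = 0.1  # planted "locally adapted" variants
    freqs[1, outliers] = 0.9
    G = np.vstack([
        rng.binomial(2, freqs[k], size=(n_per_pop, n_variants))
        for k in range(2)
    ])
    return G, outliers


genotypes, outliers = simulate_two_populations()
r2 = squared_pc_correlations(genotypes, n_components=2)
ranked = np.argsort(r2[:, 0])[::-1]
print("planted outlier variants:", outliers)
print("variants most correlated with PC1:", ranked[:5])
```

Note that the population labels are used only to simulate the data; the scan itself sees nothing but the genotype matrix, which mirrors the abstract's point that no prior definition of populations is needed.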
