Journal: Nucleic Acids Research

HapFABIA: Identification of very short segments of identity by descent characterized by rare variants in large sequencing data

Abstract

Identity by descent (IBD) can be reliably detected for long shared DNA segments, which are found in related individuals. However, many studies contain cohorts of unrelated individuals that share only short IBD segments. New sequencing technologies facilitate identification of short IBD segments through rare variants, which convey more information on IBD than common variants. Current IBD detection methods, however, are not designed to use rare variants for the detection of short IBD segments. Short IBD segments reveal genetic structures at high resolution. Therefore, they can help to improve imputation and phasing, to increase genotyping accuracy for low-coverage sequencing and to increase the power of association studies. Since short IBD segments are further assumed to be old, they can shed light on the evolutionary history of humans. We propose HapFABIA, a computational method that applies biclustering to identify very short IBD segments characterized by rare variants. HapFABIA is designed to detect short IBD segments in genotype data that were obtained from next-generation sequencing, but can also be applied to DNA microarray data. Especially in next-generation sequencing data, HapFABIA exploits rare variants for IBD detection. HapFABIA significantly outperformed competing algorithms at detecting short IBD segments on artificial and simulated data with rare variants. HapFABIA identified 160 588 different short IBD segments characterized by rare variants with a median length of 23 kb (mean 24 kb) in data for chromosome 1 of the 1000 Genomes Project. These short IBD segments contain 752 000 single nucleotide variants (SNVs), which account for 39% of the rare variants and 23.5% of all variants. The vast majority—152 000 IBD segments—are shared by Africans, while only 19 000 and 11 000 are shared by Europeans and Asians, respectively. 
IBD segments that match the Denisova or the Neandertal genome are found significantly more often in Asians and Europeans but also, in some cases exclusively, in Africans. The lengths of IBD segments and their sharing between continental populations indicate that many short IBD segments from chromosome 1 existed before humans migrated out of Africa. Thus, rare variants that tag these short IBD segments predate human migration from Africa. The software package HapFABIA is available from Bioconductor. All data sets, result files and programs for data simulation, preprocessing and evaluation are supplied at .
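The intuition that rare variants tag short IBD segments can be illustrated with a toy example. The sketch below is not the HapFABIA algorithm, which applies biclustering; it is a simple pairwise scan, and the function name `shared_rare_variant_runs`, its thresholds and the toy genotype matrix are all hypothetical. It keeps only variants carried by a few individuals and reports pairs of individuals that share several such variants within a short genomic span.

```python
from itertools import combinations


def shared_rare_variant_runs(genotypes, positions, max_carriers=3,
                             min_shared=3, max_span=30_000):
    """Toy pairwise scan (hypothetical helper, not HapFABIA itself).

    genotypes: one row per individual, 0/1 minor-allele indicators.
    positions: sorted base-pair position of each variant column.
    Returns tuples (ind_a, ind_b, start_bp, end_bp, n_shared_variants).
    """
    n_ind = len(genotypes)
    # A variant counts as "rare" here if 2..max_carriers individuals carry it.
    rare = [j for j in range(len(positions))
            if 2 <= sum(genotypes[i][j] for i in range(n_ind)) <= max_carriers]

    candidates = []
    for a, b in combinations(range(n_ind), 2):
        shared = [j for j in rare if genotypes[a][j] and genotypes[b][j]]
        run = []
        for j in shared:
            # Close the current run once it would span more than max_span bp.
            if run and positions[j] - positions[run[0]] > max_span:
                if len(run) >= min_shared:
                    candidates.append((a, b, positions[run[0]],
                                       positions[run[-1]], len(run)))
                run = []
            run.append(j)
        if len(run) >= min_shared:
            candidates.append((a, b, positions[run[0]],
                               positions[run[-1]], len(run)))
    return candidates


# Toy data (made up for illustration): 4 individuals, 6 variants.
positions = [1000, 5000, 9000, 12000, 200000, 250000]
genotypes = [
    [1, 1, 1, 0, 1, 0],
    [1, 1, 1, 1, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
]

result = shared_rare_variant_runs(genotypes, positions, max_carriers=2)
print(result)  # [(0, 1, 1000, 9000, 3)]
```

In this toy input, individuals 0 and 1 share three variants that nobody else carries, all within 8 kb, so they are reported as a candidate segment. The two variants carried by three of the four individuals are filtered out as too common. A biclustering approach differs in that it can group more than two individuals and tolerate noisy genotypes in a single model.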