Conference proceedings: Research in Computational Molecular Biology

The Clark Phase-able Sample Size Problem: Long-Range Phasing and Loss of Heterozygosity in GWAS



Abstract

A phase transition is taking place today. The amount of data generated by genome resequencing technologies is so large that in some cases it is now less expensive to repeat the experiment than to store the information it generates. In the next few years it is quite possible that millions of Americans will have been genotyped. The question then arises of how to make the best use of this information and jointly estimate the haplotypes of all these individuals. The premise of the paper is that long shared genomic regions (or tracts) are unlikely unless the haplotypes are identical by descent (IBD), in contrast to short shared tracts, which may be merely identical by state (IBS). Here we estimate for populations, using the US as a model, what sample size of genotyped individuals would be necessary to have sufficiently long shared haplotype tracts that are IBD at a statistically significant level. These tracts can then be used as input for a Clark-like phasing method to obtain a complete phasing solution of the sample. We estimate in this paper that for a population like the US with about 1% of the people genotyped (approximately 2 million), tracts about 200 SNPs long are shared IBD between pairs of individuals with high probability, which ensures that the Clark method phases successfully. We show on simulated data that the algorithm obtains an almost perfect solution if the number of individuals genotyped on SNP arrays is large enough, and that the correctness of the algorithm grows with the number of individuals genotyped. We also study a related problem that connects copy number variation with phasing algorithm success. A loss of heterozygosity (LOH) event occurs when, by the laws of Mendelian inheritance, an individual should be heterozygous but, due to a deletion polymorphism, is not.
Such polymorphisms are difficult to detect using existing algorithms, but they play an important role in the genetics of disease and will confuse haplotype phasing algorithms if not accounted for. We present an algorithm for detecting LOH regions across the genomes of thousands of individuals. The design of the long-range phasing algorithm and of the loss of heterozygosity inference algorithm was inspired by analysis of the Multiple Sclerosis (MS) GWAS dataset of the International Multiple Sclerosis Consortium, and we present in this paper results similar to those obtained from the MS data.
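The abstract refers to a "Clark-like phasing method" without giving details. As a rough illustration only, the following is a minimal sketch of the classic Clark inference rule, not the paper's long-range algorithm: genotypes with at most one heterozygous site seed a pool of known haplotypes, and each remaining genotype is resolved as a known haplotype plus its complement. The genotype coding (0/1/2 alternate-allele counts per SNP) and the names `complement` and `clark_phase` are assumptions made for this sketch; the IBD tract detection that the paper uses as input is not modeled.

```python
from typing import Dict, List, Optional, Tuple

Haplotype = Tuple[int, ...]  # one allele per SNP, coded 0 or 1 (assumed coding)
Genotype = Tuple[int, ...]   # alt-allele count per SNP: 0, 1 (het) or 2


def complement(g: Genotype, h: Haplotype) -> Optional[Haplotype]:
    """Return h2 such that h + h2 == g site by site, or None if h cannot explain g."""
    other = tuple(gi - hi for gi, hi in zip(g, h))
    return other if all(a in (0, 1) for a in other) else None


def clark_phase(genotypes: List[Genotype]) -> Dict[int, Tuple[Haplotype, Haplotype]]:
    """Phase genotypes with Clark's rule; unresolved genotypes are left out of the result."""
    known: List[Haplotype] = []  # a list keeps the resolution order deterministic
    phased: Dict[int, Tuple[Haplotype, Haplotype]] = {}

    def add(h: Haplotype) -> None:
        if h not in known:
            known.append(h)

    # Seed step: genotypes with at most one heterozygous site have a unique phasing.
    for i, g in enumerate(genotypes):
        if sum(1 for x in g if x == 1) <= 1:
            h1 = tuple(1 if x == 2 else 0 for x in g)  # het site carries allele 0
            h2 = tuple(1 if x >= 1 else 0 for x in g)  # het site carries allele 1
            phased[i] = (h1, h2)
            add(h1)
            add(h2)

    # Inference step: explain each unresolved genotype with a known haplotype.
    changed = True
    while changed:
        changed = False
        for i, g in enumerate(genotypes):
            if i in phased:
                continue
            for h in known:
                other = complement(g, h)
                if other is not None:
                    phased[i] = (h, other)
                    add(other)  # the complement becomes available to later genotypes
                    changed = True
                    break
    return phased


if __name__ == "__main__":
    sample = [(0, 0, 0), (1, 1, 0), (2, 1, 1)]
    for idx, pair in clark_phase(sample).items():
        print(idx, pair)
```

In this sketch a genotype that no known haplotype explains stays unphased, and the result can depend on the order in which genotypes are visited. Both limitations are consistent with the abstract's point that phasing success improves as more individuals are genotyped.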

Bibliographic details

  • Source
  • Conference location: Lisbon (PT)
  • Author affiliations

    Center for Computational Molecular Biology, Brown University; School of Science and Engineering, Reykjavik University; deCODE genetics

    Center for Computational Molecular Biology, Brown University; Department of Computer Science, Brown University

    Center for Computational Molecular Biology, Brown University; Department of Computer Science, Brown University

    Center for Computational Molecular Biology, Brown University; Department of Computer Science, Brown University

  • Conference organizer
  • Original format: PDF
  • Language: English
  • Chinese Library Classification: Bioengineering (biotechnology)
  • Keywords
