Journal: BMC Bioinformatics

Population genetic analysis of bi-allelic structural variants from low-coverage sequence data with an expectation-maximization algorithm

Abstract

Background: Population genetics and association studies usually rely on a set of known variable sites that are then genotyped in subsequent samples, because genotyping known variation is easier than discovering it. This is also true for structural variation detected from sequence data. However, genotypes at known variable sites can only be inferred with uncertainty from low-coverage data. Thus, statistical approaches that infer genotype likelihoods, test hypotheses, and estimate population parameters without requiring accurate genotypes are becoming popular. Unfortunately, current implementations of these methods are intended to analyse only single-nucleotide and short indel variation, and they usually assume that the two alleles in a heterozygous individual are sampled with equal probability. This assumption is generally false for structural variants detected with paired-end or split reads. Therefore, the population genetics of structural variants cannot be studied unless painstaking and potentially biased genotyping is performed first.
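
To make the point about unequal allele sampling concrete, the following is a minimal sketch, not the authors' implementation. It assumes a simple binomial read model, Hardy-Weinberg genotype proportions, and a bi-allelic site; the function names and the parameters `het_alt_fraction` and `error` are hypothetical and chosen for illustration only. It shows how genotype likelihoods can carry a heterozygote sampling fraction other than 0.5, and how an expectation-maximization loop can estimate the variant allele frequency from those likelihoods without calling genotypes.

```python
from math import comb


def genotype_likelihoods(alt_reads, total_reads, het_alt_fraction=0.5, error=0.01):
    """Binomial likelihood of the read counts under genotypes 0, 1 and 2
    (number of copies of the variant allele). Illustrative model only."""
    # Probability that a single read supports the variant allele, per genotype.
    # het_alt_fraction != 0.5 models unequal sampling of the two alleles.
    p_alt = (error, het_alt_fraction, 1.0 - error)
    ref_reads = total_reads - alt_reads
    return [
        comb(total_reads, alt_reads) * p ** alt_reads * (1.0 - p) ** ref_reads
        for p in p_alt
    ]


def em_allele_frequency(likelihoods, start=0.2, tol=1e-8, max_iter=1000):
    """EM estimate of the variant allele frequency from per-individual
    genotype likelihoods, assuming Hardy-Weinberg proportions."""
    freq = start
    for _ in range(max_iter):
        # Genotype prior implied by the current frequency estimate.
        prior = ((1 - freq) ** 2, 2 * freq * (1 - freq), freq ** 2)
        expected_alt = 0.0
        for lik in likelihoods:
            joint = [l * p for l, p in zip(lik, prior)]
            norm = sum(joint)
            # E-step: posterior expected count of variant alleles (0 to 2).
            expected_alt += (joint[1] + 2 * joint[2]) / norm
        # M-step: frequency is the expected allele count over 2N chromosomes.
        new_freq = expected_alt / (2 * len(likelihoods))
        if abs(new_freq - freq) < tol:
            return new_freq
        freq = new_freq
    return freq


# Hypothetical low-coverage data: (variant-supporting reads, total reads).
counts = [(0, 4), (1, 5), (3, 3), (0, 2), (2, 6)]
liks = [genotype_likelihoods(a, n, het_alt_fraction=0.3) for a, n in counts]
estimated_freq = em_allele_frequency(liks)
```

Setting `het_alt_fraction` below 0.5 raises the heterozygote likelihood for individuals with few variant-supporting reads, which is the kind of bias a model that assumes equal sampling would miss.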