BMC Bioinformatics

Genotype calling in tetraploid species from bi-allelic marker data using mixture models

Abstract

Background: Automated genotype calling in tetraploid species was until recently not possible, which hampered genetic analysis. Modern genotyping assays often produce two signals, one for each allele of a bi-allelic marker. While ample software is available to obtain genotypes (homozygous for either allele, or heterozygous) for diploid species from these signals, no such software is available for tetraploid species, which may be scored as five alternative genotypes (aaaa, baaa, bbaa, bbba and bbbb; nulliplex to quadruplex).

Results: We present a novel algorithm, implemented in the R package fitTetra, to assign genotypes for bi-allelic markers to tetraploid samples from genotyping assays that produce intensity signals for both alleles. The algorithm is based on the fitting of several mixture models with five components, one for each of the five possible genotypes. The models have different numbers of parameters specifying the relation between the five component means, and some of them impose a constraint on the mixing proportions to conform to Hardy-Weinberg equilibrium (HWE) ratios. The software rejects markers that do not allow reliable genotyping for the majority of the samples, and it assigns a missing score to samples that cannot be scored into one of the five possible genotypes with sufficient confidence.

Conclusions: We have validated the software with data from a collection of 224 potato varieties assayed with an Illumina GoldenGate 384 SNP array and shown that all SNPs with informative ratio distributions are fitted. Almost all fitted models appear to be correct based on visual inspection and comparison with diploid samples. When the collection of potato varieties is analyzed as if it were a population, almost all markers seem to be in Hardy-Weinberg equilibrium. The R package fitTetra is freely available under the GNU Public License from http://www.plantbreeding.wur.nl/UK/software_fitTetra.html and as Additional files with this article.
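The calling step described in the abstract can be illustrated with a minimal sketch. This is not the fitTetra implementation (which is an R package) and it does not fit the mixture; it assumes the five component means and a shared standard deviation are already known, assumes each sample is summarized by a single signal ratio b/(a+b), and uses Gaussian components. The function names, the example means, the standard deviation and the 0.95 posterior threshold are all hypothetical choices made for illustration. It shows two ideas from the text: mixing proportions constrained to HWE ratios, and a missing score for samples that cannot be assigned with sufficient confidence.

```python
import math


def hwe_proportions(p, ploidy=4):
    """Expected genotype frequencies under HWE for allele-b frequency p.

    Under random mating the dosage of allele b (0..ploidy) follows a
    Binomial(ploidy, p) distribution, giving five classes for a tetraploid.
    """
    return [
        math.comb(ploidy, k) * p ** k * (1 - p) ** (ploidy - k)
        for k in range(ploidy + 1)
    ]


def normal_pdf(x, mu, sd):
    """Density of a normal distribution with mean mu and std. dev. sd."""
    z = (x - mu) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))


def call_genotypes(ratios, means, sd, weights, min_posterior=0.95):
    """Assign each sample the dosage class with the highest posterior.

    Returns None (a missing score) when the best posterior probability
    is below min_posterior. The threshold value is an assumption.
    """
    calls = []
    for x in ratios:
        dens = [w * normal_pdf(x, m, sd) for w, m in zip(weights, means)]
        total = sum(dens)
        if total == 0.0:
            calls.append(None)
            continue
        post = [d / total for d in dens]
        best = max(range(len(post)), key=post.__getitem__)
        calls.append(best if post[best] >= min_posterior else None)
    return calls


# Hypothetical component means for dosages 0..4 (aaaa .. bbbb).
means = [0.1, 0.3, 0.5, 0.7, 0.9]
weights = hwe_proportions(0.5)
ratios = [0.11, 0.49, 0.40, 0.88]
calls = call_genotypes(ratios, means, sd=0.03, weights=weights)
print(calls)
```

In this toy input the third sample lies midway between two component means, so neither genotype reaches the threshold and the sample receives a missing score. A full implementation along the lines of the abstract would additionally estimate the means, variances and mixing proportions from the data, compare several candidate models, and reject markers for which too few samples can be called.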