
Analysis and Optimization of Bulk DNA Sampling with Binary Scoring for Germplasm Characterization



Abstract

Bulk DNA sampling has been a valuable method for studying large numbers of individuals through genetic markers. We used information theory to analyze the application of this strategy to discrimination among germplasm sources, considering the case of polymorphic alleles scored as present or absent in DNA pools. We defined the informativeness of a set of marker loci in bulks as the mutual information between genotype and population identity, which is composed of two terms: diversity and noise. The diversity term is the entropy of bulk genotypes, whereas the noise term is the conditional entropy of bulk genotypes given germplasm sources. Thus, optimizing marker information implies increasing diversity and reducing noise. Simple formulas were devised to estimate marker information per allele from a set of estimated allele frequencies across populations. As an example, these formulas allowed the optimization of bulk size for SSR genotyping in maize, using allele frequencies estimated in a sample of 56 maize populations. A sample of 30 plants from a random mating population was found to be adequate for SSR characterization of maize germplasm. We analyzed the use of divided bulks to overcome the allele dilution problem in DNA pools, and concluded that samples of 30 plants divided into three bulks of 10 plants are efficient for characterizing maize germplasm sources through SSR, with good control of the dilution problem. We estimated the informativeness of 30 SSR loci from the estimated allele frequencies in maize populations and found wide variation in marker informativeness, which was positively correlated with the number of alleles per locus.
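The decomposition of marker information into diversity minus noise can be illustrated with a minimal sketch for a single allele scored as present or absent. This is not the paper's own formula, which the abstract does not give. It assumes equally likely populations, diploid plants sampled from a random mating population, and perfect detection of any allele present in the bulk (so it does not model dilution). The function names and the example frequencies are hypothetical.

```python
import math


def binary_entropy(q):
    """Shannon entropy in bits of a presence/absence score with P(present) = q."""
    if q <= 0.0 or q >= 1.0:
        return 0.0
    return -(q * math.log2(q) + (1.0 - q) * math.log2(1.0 - q))


def presence_probability(p, n_plants, ploidy=2):
    """Probability that an allele of frequency p occurs at least once in a bulk.

    Assumption: random mating, so the bulk holds ploidy * n_plants
    independent gene copies, and any copy present is detected.
    """
    return 1.0 - (1.0 - p) ** (ploidy * n_plants)


def allele_information(freqs, n_plants):
    """Mutual information in bits between the bulk score and population identity.

    freqs holds the allele frequency in each population; populations are
    assumed equally likely (an assumption of this sketch).
    """
    q = [presence_probability(p, n_plants) for p in freqs]
    q_mean = sum(q) / len(q)
    diversity = binary_entropy(q_mean)                   # entropy of bulk scores
    noise = sum(binary_entropy(x) for x in q) / len(q)   # entropy given the source
    return diversity - noise


if __name__ == "__main__":
    # Hypothetical allele frequencies in four populations.
    example_freqs = [0.0, 0.02, 0.10, 0.40]
    for n in (1, 5, 10, 30, 100):
        print(n, round(allele_information(example_freqs, n), 4))
```

Under these assumptions, very large bulks make the allele appear present in every population that carries it at all, which lowers diversity, while very small bulks leave the score largely to sampling chance, which raises noise. Scanning over bulk sizes as in the loop above is one way to look for an intermediate optimum.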
