Source journal: Bioinformatics

Unbiased probabilistic taxonomic classification for DNA barcoding



Abstract

Motivation: When targeted to a barcoding region, high-throughput sequencing can be used to identify species or operational taxonomic units from environmental samples, and thus to study the diversity and structure of species communities. Although many methods provide confidence scores for taxonomic assignments, translating these scores into unbiased probabilities is not straightforward. We present a probabilistic method for taxonomic classification (PROTAX) of DNA sequences. Given a pre-defined taxonomic tree that is only partially populated by reference sequences, PROTAX decomposes a total probability of one among the set of all possible outcomes. PROTAX accounts for species that are present in the taxonomy but lack reference sequences, for the possibility of unknown taxonomic units, and for mislabeled reference sequences. PROTAX is based on a multinomial regression model, and it can use any kind of sequence similarity measure, or the outputs of other classifiers, as predictors.
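
The idea of splitting a probability of one across every outcome in a taxonomy can be illustrated with a small sketch. This is not the PROTAX implementation. The tree, the linear-predictor scores, the "unknown" outcome with a baseline score of 0.0, and the function names `softmax` and `decompose` are all assumptions made for illustration. In a real multinomial regression model the scores would be computed from predictors such as sequence similarity, which this sketch does not attempt.

```python
import math

def softmax(linear_predictors):
    """Map real-valued scores to probabilities that sum to one."""
    top = max(linear_predictors.values())  # subtract the max for numerical stability
    exps = {k: math.exp(v - top) for k, v in linear_predictors.items()}
    total = sum(exps.values())
    return {k: e / total for k, e in exps.items()}

def decompose(tree, scores, node="root", prob=1.0, out=None):
    """Recursively split `prob` among the children of `node` plus an 'unknown' outcome.

    `tree` maps a node name to its list of children (hypothetical layout).
    `scores` maps a node name to a made-up linear predictor.
    """
    if out is None:
        out = {}
    children = tree.get(node, [])
    if not children:
        out[node] = prob  # leaf outcome: keep the probability mass that reached it
        return out
    preds = {child: scores[child] for child in children}
    preds[node + "/unknown"] = 0.0  # assumed baseline for an unknown taxon under `node`
    for outcome, p in softmax(preds).items():
        decompose(tree, scores, outcome, prob * p, out)
    return out

# Toy taxonomy: two genera, one of which has two species.
tree = {"root": ["GenusA", "GenusB"], "GenusA": ["A. one", "A. two"]}
scores = {"GenusA": 2.0, "GenusB": 0.5, "A. one": 1.5, "A. two": -1.0}

for outcome, p in decompose(tree, scores).items():
    print(f"{outcome}: {p:.3f}")
```

Because each level applies a softmax to the mass it receives, the probabilities of all leaf outcomes, including the "unknown" ones, sum to one by construction.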
