Journal: Applications in Plant Sciences

Not that kind of tree: Assessing the potential for decision tree–based plant identification using trait databases

Abstract

Premise: Advancements in machine learning and the rise of accessible “big data” provide an important opportunity to improve trait-based plant identification. Here, we applied decision-tree induction to a subset of data from the TRY plant trait database to (1) assess the potential of decision trees for plant identification and (2) determine informative traits for distinguishing taxa.

Methods: Decision trees were induced using 16 vegetative and floral traits (689 species, 20 genera). We assessed how well the algorithm classified species from test data and pinpointed those traits that were important for identification across diverse taxa.

Results: The unpruned tree correctly placed 98% of the species in our data set into genera, indicating its promise for distinguishing among the species used to construct it. Furthermore, in the pruned tree, an average of 89% of the species from the test data sets were properly classified into their genera, demonstrating the flexibility of decision trees to also classify new species into genera within the tree. Closer inspection revealed that seven of the 16 traits were sufficient for the classification, and these traits yielded approximately two times more initial information gain than those not included.

Discussion: Our findings demonstrate the potential for tree-based machine learning and big data in distinguishing among taxa and determining which traits are important for plant identification.
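The abstract ranks traits by their initial information gain but does not say which induction algorithm or software was used. As a minimal sketch of that idea, the following assumes Shannon entropy over categorical traits, which is one common way to define information gain. The function names, the trait names (`leaf_type`, `flower_color`), the genus labels, and the four records are all hypothetical and are not taken from TRY or from the paper.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, trait, target="genus"):
    """Drop in genus entropy after splitting the rows on one categorical trait."""
    parent = entropy([r[target] for r in rows])
    groups = {}
    for r in rows:
        groups.setdefault(r[trait], []).append(r[target])
    # Weighted average entropy of the subsets produced by the split.
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return parent - remainder


# Hypothetical toy records for illustration only (not real trait data).
rows = [
    {"genus": "GenusA", "leaf_type": "simple", "flower_color": "white"},
    {"genus": "GenusA", "leaf_type": "simple", "flower_color": "yellow"},
    {"genus": "GenusB", "leaf_type": "compound", "flower_color": "yellow"},
    {"genus": "GenusB", "leaf_type": "compound", "flower_color": "white"},
]

# Rank traits by how much a first split on each would reduce uncertainty.
gains = {t: information_gain(rows, t) for t in ("leaf_type", "flower_color")}
ranked = sorted(gains, key=gains.get, reverse=True)
print(ranked)
```

In this toy set, `leaf_type` separates the two genera completely while `flower_color` tells them apart no better than chance, so a tree inducer using this criterion would split on `leaf_type` first. Real trait tables also contain continuous and missing values, which this sketch does not handle.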
