Source: U.S. National Institutes of Health literature > PLoS Clinical Trials

Divide and conquer! Data-mining tools and sequential multivariate analysis to search for diagnostic morphological characters within a plant polyploid complex (Veronica subsect. Pentasepalae, Plantaginaceae)



Abstract

This study exhaustively explores leaf features in search of diagnostic characters to aid classification (assigning cases to groups, i.e. populations to taxa) in a polyploid plant-species complex. A challenging case study was selected: Veronica subsection Pentasepalae, a taxonomically intricate group. A “divide and conquer” approach was implemented; that is, a difficult primary dataset was split into more manageable subsets. Three techniques were explored: two data-mining tools (artificial neural networks and decision trees) and one unsupervised discriminant analysis. However, only the decision trees and the discriminant analysis were finally used to select diagnostic traits. A previously established classification hypothesis based on other data sources was used as a starting point. A guided discriminant analysis (i.e. involving manual character selection) was used to produce a grouping scheme fitting this hypothesis so that it could be taken as a reference. Sequential unsupervised multivariate analysis enabled the recognition of all species and infraspecific taxa; however, the classification rate achieved was suboptimal. Decision trees gave better classification rates than unsupervised multivariate analysis, but three complete taxa were misidentified (not present in terminal nodes). In the case of decision trees, the variable selection led to a different grouping scheme. The resulting groups displayed low misclassification rates when analyzed using artificial neural networks. Both decision trees and discriminant analysis are recommended in the search for diagnostic characters. Because artificial neural networks are highly sensitive to the combination of input/output layers, they are proposed as evaluation tools for morphometric studies. The “divide and conquer” principle is a promising strategy, and it proved successful in the present case study.
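As a rough illustration of how a decision tree can point to diagnostic characters, the sketch below recursively splits a tiny dataset and records which trait is used at each split. It is not the authors' software or procedure: the abstract does not say which tree algorithm was used, so a CART-style Gini criterion is assumed here, and the trait names (`leaf_length`, `leaf_width`, `teeth`), the measurements, and the taxon labels A, B, C are all hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure group)."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return (weighted impurity, trait, threshold) of the best binary split, or None."""
    best = None
    for trait in rows[0]:
        values = sorted({r[trait] for r in rows})
        # Candidate thresholds are midpoints between consecutive observed values.
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [l for r, l in zip(rows, labels) if r[trait] <= thr]
            right = [l for r, l in zip(rows, labels) if r[trait] > thr]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, trait, thr)
    return best

def diagnostic_traits(rows, labels):
    """Split the data recursively; list the trait chosen at each split."""
    if len(set(labels)) <= 1:
        return []  # already a single taxon, nothing left to separate
    split = best_split(rows, labels)
    if split is None:
        return []  # no trait varies, so the group cannot be divided further
    _, trait, thr = split
    used = [trait]
    for keep in (lambda v: v <= thr, lambda v: v > thr):
        sub = [(r, l) for r, l in zip(rows, labels) if keep(r[trait])]
        used += diagnostic_traits([r for r, _ in sub], [l for _, l in sub])
    return used

# Hypothetical leaf measurements for three hypothetical taxa (A, B, C).
rows = [
    {"leaf_length": 10, "leaf_width": 2.0, "teeth": 4},
    {"leaf_length": 12, "leaf_width": 2.5, "teeth": 5},
    {"leaf_length": 11, "leaf_width": 6.0, "teeth": 4},
    {"leaf_length": 13, "leaf_width": 6.5, "teeth": 5},
    {"leaf_length": 25, "leaf_width": 6.2, "teeth": 5},
    {"leaf_length": 27, "leaf_width": 2.2, "teeth": 4},
]
labels = ["A", "A", "B", "B", "C", "C"]

print(diagnostic_traits(rows, labels))  # ['leaf_length', 'leaf_width']
```

In this toy data, leaf length first isolates taxon C, and leaf width then separates A from B within the remaining subset, which mirrors the "divide and conquer" idea of working on smaller, more manageable groups. The `teeth` trait is never selected, so under these assumptions it would not count as diagnostic.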
