
Hierarchical ensemble classification: towards the classification of data collections that feature large numbers of class labels


Abstract

In this thesis a number of hierarchical ensemble classification approaches are proposed as a solution to the multi-class classification problem. The central idea is that a more effective classification can be produced if a “coarse-grain” classification (directed at groups of classes) is conducted first, followed by increasingly “fine-grain” classifications. The hierarchical ensemble classification model comprises a set of base classifiers held within the nodes of the hierarchy (one classifier per node). Nodes near the root hold classifiers designed to discriminate between groups of class labels, while the leaves hold classifiers designed to distinguish between individual class labels. Two types of hierarchy structure are considered: Binary Tree (BT) hierarchies and Directed Acyclic Graph (DAG) hierarchies. For the DAG hierarchies, two alternative structures are considered for generating the desired hierarchical ensemble classification model: (i) rooted DAG, and (ii) non-rooted DAG. The main challenges are: (i) how best to distribute class labels between nodes within the hierarchy, (ii) how to address the “successive mis-classification” issue associated with hierarchical classification, where a mis-classification that occurs early in the process (near the root of the hierarchy) cannot be rectified later in the process, and (iii) how best to determine the starting node within the non-rooted DAG approach. To address the first issue, different techniques based on the concepts of clustering, splitting, and combination are proposed. To address the second and third issues, the idea is to utilise the probability or confidence values associated with Naive Bayes and CARM classifiers respectively to dictate whether single or multiple paths should be followed at each hierarchy node, and to select the best starting DAG node in the non-rooted DAG approach.
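The single-versus-multiple path idea can be illustrated with a minimal Python sketch for the Binary Tree case. It is not the thesis's implementation: the `Node` class, the `threshold` rule, and the scoring of a label by the product of branch probabilities along its path are all assumptions made for illustration, and the node classifiers are fixed stubs rather than trained Naive Bayes or CARM models.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Union


@dataclass
class Node:
    # classify(x) returns one probability per branch (hypothetical interface).
    classify: Callable[[Sequence[float]], List[float]]
    # Each branch is either another Node (a group of labels) or a class label.
    branches: List[Union["Node", str]]


def collect_scores(node, x, threshold, weight=1.0, scores=None):
    """Walk the hierarchy, following one branch when the node is confident
    and every branch otherwise. Returns {label: path score}."""
    if scores is None:
        scores = {}
    probs = node.classify(x)
    best = max(probs)
    # Assumed rule: a single path if the top probability reaches the threshold.
    chosen = [probs.index(best)] if best >= threshold else range(len(probs))
    for i in chosen:
        branch, w = node.branches[i], weight * probs[i]
        if isinstance(branch, Node):
            collect_scores(branch, x, threshold, w, scores)
        else:
            scores[branch] = max(scores.get(branch, 0.0), w)
    return scores


def predict(root, x, threshold=0.7):
    scores = collect_scores(root, x, threshold)
    return max(scores, key=scores.get)


# Stub classifiers with fixed outputs, standing in for trained models.
left = Node(lambda x: [0.6, 0.4], ["a", "b"])      # distinguishes a from b
right = Node(lambda x: [0.95, 0.05], ["c", "d"])   # distinguishes c from d
root = Node(lambda x: [0.55, 0.45], [left, right])  # {a, b} versus {c, d}

print(predict(root, [], threshold=0.5))  # single path at every node
print(predict(root, [], threshold=0.7))  # both branches explored at the root
```

With the lower threshold the uncertain root commits to the left group and the result is fixed to that subtree; with the higher threshold both subtrees are explored and the more confident right-hand classifier can win, which is the kind of recovery from an early mis-classification that the abstract describes.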

Bibliographic record

  • Author

    Alshdaifat E;

  • Year 2000
  • Original format PDF
  • Text language en
