Random Ordinality Ensembles: A Novel Ensemble Method for Multi-valued Categorical Data

Abstract

Data with multi-valued categorical attributes can cause major problems for decision trees. The high branching factor can lead to data fragmentation, where decisions have little or no statistical support. In this paper, we propose a new ensemble method, Random Ordinality Ensembles (ROE), that circumvents this problem and provides significantly improved accuracies over other popular ensemble methods. We perform a random projection of the categorical data into a continuous space by imposing random ordinality on categorical attribute values. A decision tree that learns on this new continuous space is able to use binary splits, hence avoiding the data fragmentation problem. A majority-vote ensemble is then constructed with several trees, each learnt from a different continuous space. An empirical evaluation on 13 datasets shows this simple method to significantly outperform standard techniques such as Boosting and Random Forests. A theoretical study using an information gain framework is carried out to explain the performance of Random Ordinality (RO). The study shows that ROE is quite robust to the data fragmentation problem and that RO trees are significantly smaller than trees generated using multi-way splits.
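The pipeline described in the abstract (random ordering of category values, a binary-split tree per ordering, then a majority vote) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function names, the Gini impurity criterion, the depth limit, and the toy data are all assumptions made for the example, and only the Python standard library is used.

```python
import random
from collections import Counter

def random_ordinality(column_values, rng):
    """Assign each attribute's category values a random permutation of ranks."""
    mapping = []
    for values in column_values:
        ranks = list(range(len(values)))
        rng.shuffle(ranks)
        # sorted() keeps the assignment reproducible for a given seed
        mapping.append(dict(zip(sorted(values), ranks)))
    return mapping

def encode(row, mapping):
    # Raises KeyError for a category value that was not seen during training.
    return [m[v] for m, v in zip(mapping, row)]

def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def build_tree(X, y, depth=0, max_depth=5):
    """Greedy binary-split tree on numeric features (Gini is an assumption here)."""
    if len(set(y)) == 1 or depth == max_depth:
        return Counter(y).most_common(1)[0][0]
    best = None
    for j in range(len(X[0])):
        # Candidate thresholds: every observed value except the largest
        for t in sorted(set(r[j] for r in X))[:-1]:
            left = [lab for r, lab in zip(X, y) if r[j] <= t]
            right = [lab for r, lab in zip(X, y) if r[j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[0]:
                best = (score, j, t)
    if best is None:  # all rows identical, no split possible
        return Counter(y).most_common(1)[0][0]
    _, j, t = best
    lo = [(r, lab) for r, lab in zip(X, y) if r[j] <= t]
    hi = [(r, lab) for r, lab in zip(X, y) if r[j] > t]
    return (j, t,
            build_tree([r for r, _ in lo], [lab for _, lab in lo], depth + 1, max_depth),
            build_tree([r for r, _ in hi], [lab for _, lab in hi], depth + 1, max_depth))

def predict_tree(tree, x):
    # Internal nodes are tuples; leaves are class labels (assumed non-tuple).
    while isinstance(tree, tuple):
        j, t, left, right = tree
        tree = left if x[j] <= t else right
    return tree

def fit_roe(rows, labels, n_trees=25, seed=0):
    """Train one tree per random ordinal encoding of the categorical rows."""
    rng = random.Random(seed)
    column_values = [set(col) for col in zip(*rows)]
    ensemble = []
    for _ in range(n_trees):
        mapping = random_ordinality(column_values, rng)
        X = [encode(r, mapping) for r in rows]
        ensemble.append((mapping, build_tree(X, labels)))
    return ensemble

def predict_roe(ensemble, row):
    """Majority vote over the trees, each using its own encoding."""
    votes = Counter(predict_tree(tree, encode(row, m)) for m, tree in ensemble)
    return votes.most_common(1)[0][0]

# Toy data, invented for illustration only
rows = [(c, s) for c in ("red", "green", "blue", "yellow") for s in ("S", "M", "L")]
labels = ["warm" if c in ("red", "yellow") else "cool" for c, _ in rows]
ensemble = fit_roe(rows, labels, n_trees=7, seed=1)
print(predict_roe(ensemble, ("red", "M")))
```

Because each tree sees the categories in a different random order, a value grouping that is awkward to isolate under one ordering may be a single threshold split under another, which is presumably where the diversity of the ensemble comes from.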


Bibliographic details

Source: Multiple Classifier Systems | 2009 | pp. 222-231 | 10 pages
Conference location: Reykjavik (IS)
Authors: Amir Ahmad; Gavin Brown
Author affiliation: School of Computer Science, University of Manchester, Manchester, M13 9PL, UK
Original format: PDF
Language: English
Chinese Library Classification: TP274.3
Keywords: decision trees; data fragmentation; random ordinality; binary splits; multi-way splits


