Trees Weighting Random Forest Method for Classifying High-Dimensional Noisy Data

Abstract

Random forest is an ensemble learning method composed of multiple decision trees, each grown on a random sample of the training data and split at each node on a random subset of features. Owing to its good classification and generalization ability, random forest has been applied successfully in many domains. However, when the data set is high-dimensional and contains many noise features, random forest generates many noisy trees. These noisy trees reduce classification accuracy and can even lead to wrong decisions on new instances. In this paper, we present a new approach to this problem, named Trees Weighting Random Forest (TWRF), which weights the trees according to their classification ability. The out-of-bag (OOB) data, i.e. the subset of the training data that bagging leaves out when building a given decision tree, is used to evaluate that tree. For simplicity, we use accuracy as the measure of a tree's classification ability and set it as the tree's weight. Experiments show that TWRF performs better than the original random forest and other traditional methods such as C4.5 and Naïve Bayes.
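The weighting idea in the abstract can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the base learner, the forest size, or how the weights enter the final decision, so the small Gini-based tree, the square-root feature subset size, the weighted majority vote, and all function names (`build_tree`, `fit_twrf`, `predict_twrf`) are hypothetical choices made here. What follows the abstract is the core step: each tree is scored on its own out-of-bag rows and that accuracy becomes its weight.

```python
import random
from collections import Counter, defaultdict


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def build_tree(X, y, rng, n_try, depth=0, max_depth=5):
    """Grow a tree, testing a random subset of n_try features at each node."""
    majority = Counter(y).most_common(1)[0][0]
    if depth == max_depth or len(set(y)) == 1:
        return majority  # leaf: majority class
    best = None
    for f in rng.sample(range(len(X[0])), n_try):
        for t in sorted({row[f] for row in X})[:-1]:  # largest value cannot split
            left = [i for i, row in enumerate(X) if row[f] <= t]
            right = [i for i, row in enumerate(X) if row[f] > t]
            score = (len(left) * gini([y[i] for i in left])
                     + len(right) * gini([y[i] for i in right])) / len(y)
            if best is None or score < best[0]:
                best = (score, f, t, left, right)
    if best is None:
        return majority  # every sampled feature was constant
    _, f, t, left, right = best
    children = [
        build_tree([X[i] for i in side], [y[i] for i in side],
                   rng, n_try, depth + 1, max_depth)
        for side in (left, right)
    ]
    return (f, t, children[0], children[1])


def predict_tree(node, row):
    """Walk from the root to a leaf (assumes class labels are not tuples)."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node


def fit_twrf(X, y, n_trees=25, seed=0):
    """Return a list of (tree, weight) pairs; weight = out-of-bag accuracy."""
    rng = random.Random(seed)
    n = len(X)
    n_try = max(1, int(len(X[0]) ** 0.5))  # assumption: sqrt(#features) per node
    forest = []
    for _ in range(n_trees):
        boot = [rng.randrange(n) for _ in range(n)]  # bootstrap sample (bagging)
        in_bag = set(boot)
        oob = [i for i in range(n) if i not in in_bag]  # rows this tree never saw
        if not oob:
            continue  # assumption: a tree that cannot be scored is dropped
        tree = build_tree([X[i] for i in boot], [y[i] for i in boot], rng, n_try)
        hits = sum(predict_tree(tree, X[i]) == y[i] for i in oob)
        forest.append((tree, hits / len(oob)))
    return forest


def predict_twrf(forest, row):
    """Weighted majority vote: each tree adds its OOB accuracy to its class."""
    votes = defaultdict(float)
    for tree, weight in forest:
        votes[predict_tree(tree, row)] += weight
    return max(votes, key=votes.get)
```

With `forest = fit_twrf(X, y)`, where `X` is a list of numeric feature rows and `y` the matching labels, `predict_twrf(forest, row)` classifies a new row. Setting every weight to 1.0 recovers an ordinary unweighted vote, which makes the two schemes easy to compare on the same trees.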

Bibliographic details

Source: IEEE International Conference on e-Business Engineering | 2010 | 4 pages
Authors: Li Hong Bo; Wang Wei; Ding Hong Wei; Dong Jin
Original format: PDF
Chinese Library Classification: F713-53
Keywords: data mining; ensemble learning; classification; random forest

Similar literature

1. Classifying many-class high-dimensional fingerprint datasets using random forest of oblique decision trees [J]. Thanh-Nghi Do, Philippe Lenca, Stéphane Lallich. Vietnam Journal of Computer Science. 2015, No. 1
2. Classifying Very High-Dimensional Data with Random Forests Built from Small Subspaces [J]. Baoxun Xu, Joshua Zhexue Huang, Graham Williams. International Journal of Data Warehousing and Mining. 2012, No. 2
3. Testing the reliability and stability of the internal accuracy assessment of random forest for classifying tree defoliation levels using different validation methods [J]. Adelabu Samuel, Mutanga Onisimo, Adam Elhadi. Geocarto International. 2015, No. 7a8
4. Trees Weighting Random Forest Method for Classifying High-Dimensional Noisy Data [C]. Li Hong Bo, Wang Wei, Ding Hong Wei. 7th IEEE International Conference on e-Business Engineering. 2010
5. Secure Training of Random Forest Classifiers over Continuous Data [D]. Shen, Jianwei. 2020
6. Data mining methods in the prediction of Dementia: A real-data comparison of the accuracy, sensitivity and specificity of linear discriminant analysis, logistic regression, neural networks, support vector machines, classification trees and random forests [O]. João Maroco, Dina Silva, Ana Rodrigues. 2011
7. Classifying many-class high-dimensional fingerprint datasets using random forest of oblique decision trees [O]. 2015
