A Statistical Method for Determining Importance of Variables in an Information System

Abstract

A new method for estimation of attributes' importance for supervised classification, based on the random forest approach, is presented. Essentially, an iterative scheme is applied, with each step consisting of several runs of the random forest program. Each run is performed on a suitably modified data set: values of each attribute found unimportant at earlier steps are randomly permuted between objects. At each step, apparent importance of an attribute is calculated and the attribute is declared unimportant if its importance is not uniformly better than that of the attributes earlier found unimportant. The procedure is repeated until only attributes scoring better than the randomized ones are retained. Statistical significance of the results so obtained is verified. This method has been applied to 12 data sets of biological origin. The method was shown to be more reliable than that based on standard application of a random forest to assess attributes' importance.
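The iterative scheme described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say how the first set of unimportant attributes is chosen, so the sketch seeds it with the single lowest-scoring attribute, and it reads "uniformly better" as "better in every run than the best permuted attribute". The names `select_attributes`, `abs_correlation_importance`, and `make_demo_data` are hypothetical, and a simple correlation-based scorer stands in for the random forest so that the example needs only the standard library.

```python
import random
from statistics import mean


def abs_correlation_importance(columns, labels):
    """Stand-in scorer (NOT a random forest): |Pearson r| of each attribute with the label."""
    def abs_corr(xs, ys):
        mx, my = mean(xs), mean(ys)
        sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        sxx = sum((x - mx) ** 2 for x in xs)
        syy = sum((y - my) ** 2 for y in ys)
        return abs(sxy) / (sxx * syy) ** 0.5 if sxx > 0 and syy > 0 else 0.0
    return [abs_corr(col, labels) for col in columns]


def select_attributes(columns, labels, importance_fn, runs=10, seed=0):
    """Return indices of attributes that keep beating the permuted ones.

    columns: list of attribute columns (each a list of numbers).
    importance_fn: callable(columns, labels) -> one score per attribute.
    """
    rng = random.Random(seed)
    n_attrs = len(columns)
    if n_attrs == 0:
        return []
    unimportant = set()
    while True:
        # One step = several runs on a suitably modified data set.
        scores = []
        for _ in range(runs):
            modified = []
            for j, col in enumerate(columns):
                if j in unimportant:
                    col = col[:]      # copy, then permute values between objects
                    rng.shuffle(col)
                modified.append(col)
            scores.append(importance_fn(modified, labels))

        if not unimportant:
            # ASSUMPTION: seed the set with the attribute of lowest mean importance.
            means = [mean(run[j] for run in scores) for j in range(n_attrs)]
            newly = {min(range(n_attrs), key=means.__getitem__)}
        else:
            # ASSUMPTION: "uniformly better" = beats the best permuted attribute in every run.
            newly = set()
            for j in range(n_attrs):
                if j in unimportant:
                    continue
                if not all(run[j] > max(run[k] for k in unimportant) for run in scores):
                    newly.add(j)

        if not newly:  # nothing else was rejected: the remaining attributes are retained
            return sorted(set(range(n_attrs)) - unimportant)
        unimportant |= newly


def make_demo_data(n=200, seed=1):
    """Synthetic data: attributes 0-1 track the class label, attributes 2-4 are pure noise."""
    rng = random.Random(seed)
    labels = [i % 2 for i in range(n)]
    informative = [[y + rng.gauss(0, 0.3) for y in labels] for _ in range(2)]
    noise = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(3)]
    return informative + noise, labels


if __name__ == "__main__":
    columns, labels = make_demo_data()
    print(select_attributes(columns, labels, abs_correlation_importance))
```

To follow the abstract more closely, `importance_fn` could wrap a random forest, for example the `feature_importances_` attribute of scikit-learn's `RandomForestClassifier`. The sketch also omits the final check of statistical significance that the abstract mentions.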


Bibliographic record

Source: International Conference on Rough Sets and Current Trends in Computing (RSCTC 2006); November 6-8, 2006; Kobe (JP) | 2006 | pp. 557-566 | 10 pages
Conference location: Kobe (JP)
Authors: Witold R. Rudnicki; Marcin Kierczak; Jacek Koronacki; Jan Komorowski
Author affiliation: ICM, Warsaw University, Pawinskiego 5a, Warsaw, Poland
Original format: PDF
Language: English
Chinese Library Classification: Computing technology, computer technology

Similar literature

1. Determining Suitable Investment Areas Using Multi-variable Statistical Methods: Evidence from the Black Sea Region in Turkey [J]. KADRI CEMIL AKYUZ, ILKER AKYUZ, CIGDEM CAVRAR, European Planning Studies. 2004, No. 8
2. Variable selection methods in multivariate statistical process control: A systematic literature review [J]. Pimentel Peres Fernanda Araujo, Fogliatto Flavio Sanson, Computers & Industrial Engineering. 2018, No. JANa
3. A statistical, task-based evaluation method for three-dimensional x-ray breast imaging systems using variable-background phantoms [J]. Park S, Jennings R, Liu H, Medical Physics. 2010, No. 12
4. A Statistical Method for Determining Importance of Variables in an Information System [C]. Witold R. Rudnicki, Marcin Kierczak, Jacek Koronacki, International Conference on Rough Sets and Current Trends in Computing (RSCTC 2006); November 6-8, 2006; Kobe (JP). 2006
5. Nonparametric regression as a general statistical modeling methodology: A Monte Carlo investigation of factors influencing statistical power and robust performance in the presence of moderator variables [D]. McLeod, Jeffrey Thomas. 1998
6. Statistical Methods Used to Test for Agreement of Medical Instruments Measuring Continuous Variables in Method Comparison Studies: A Systematic Review [O]. Rafdzah Zaki, Awang Bulgiba, Roshidi Ismail, 2009
7. A Validated Stability-Indicating RP-HPLC Method for the Simultaneous Determination of Tenofovir, Emtricitabine, and a Efavirenz and Statistical Approach to Determine the Effect of Variables [O]. Prashant S. Devrukhakar, Roshan Borkar, Nalini Shastri, 2013
