Imputation of missing values for semi-supervised data using the proximity in random forests

Tsunenori Ishioka

Abstract

This paper presents a procedure that imputes missing values by using random forests on semi-supervised data. Applying our method to Hewlett-Packard Labs' spam data and Edgar Anderson's iris data, we found that the rate of correct classification is higher than that of other methods: a simple expansion of Liaw's 'rfImpute' for (un)supervised data and the k-nearest neighbour method (kNN). Our method allows missing predictor variables as well as a missing response variable. Imputation that uses random forests for semi-supervised cases in the training dataset has not been implemented before.
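The abstract does not spell out the procedure, so the following is only a minimal sketch of the general idea behind proximity-based imputation, not the paper's algorithm. It assumes a proximity matrix is already available (in practice it would come from a fitted random forest; here it is typed in by hand) and fills one column at a time: a numeric value becomes the proximity-weighted average of the observed values, and a categorical value becomes the category with the largest average proximity. The function name `impute_with_proximity` and the toy data are hypothetical. Treating a missing class label as just another categorical column is likewise an assumption made for illustration.

```python
from collections import defaultdict


def impute_with_proximity(values, proximity, categorical=False):
    """Fill None entries of one column using a proximity matrix.

    values: list in which None marks a missing entry.
    proximity: square list of lists; proximity[i][j] is assumed to be the
        fraction of trees in which cases i and j share a terminal node.
    """
    observed = [j for j, v in enumerate(values) if v is not None]
    filled = list(values)
    if not observed:
        return filled  # nothing to borrow from; leave the column unchanged

    for i, v in enumerate(values):
        if v is not None:
            continue
        if categorical:
            # Choose the category with the largest average proximity to case i.
            total = defaultdict(float)
            count = defaultdict(int)
            for j in observed:
                total[values[j]] += proximity[i][j]
                count[values[j]] += 1
            filled[i] = max(total, key=lambda c: total[c] / count[c])
        else:
            weight = sum(proximity[i][j] for j in observed)
            if weight == 0:
                # No proximate observed case: fall back to the plain mean.
                filled[i] = sum(values[j] for j in observed) / len(observed)
            else:
                # Proximity-weighted average of the observed values.
                filled[i] = sum(proximity[i][j] * values[j] for j in observed) / weight
    return filled


# Hand-written toy proximity matrix for three cases (an assumption, not real output).
proximity = [
    [1.0, 0.8, 0.2],
    [0.8, 1.0, 0.1],
    [0.2, 0.1, 1.0],
]

numeric = impute_with_proximity([None, 2.0, 6.0], proximity)
labels = impute_with_proximity([None, "spam", "ham"], proximity, categorical=True)
print(numeric, labels)
```

In a full procedure this update would normally be alternated with refitting the forest on the filled-in data for a few iterations; that loop is omitted here.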

Bibliographic details

Source: International Journal of Business Intelligence and Data Mining, 2013, No. 2, pp. 155-166 (12 pages)
Author: Tsunenori Ishioka
Affiliation: Research Division, The National Center for University Entrance Examinations, 2-19-23 Komaba, Meguro-ku, Tokyo 153-8501, Japan
Format: PDF
Language: English
Keywords: ensemble learning; data imputation; missing data; k-nearest neighbour; kNN; R; rfImpute; random forests; semi-supervised learning
