Missing Value Estimation for Mixed-Attribute Data Sets

Zhu Xiaofeng; Zhang Shichao; Jin Zhi; Zhang Zili; Xu Zhuoming


Abstract

Missing data imputation is a key issue in learning from incomplete data. Various techniques have been developed, with great success, for handling missing values in data sets with homogeneous attributes (their independent attributes are all either continuous or discrete). This paper studies a new setting of missing data imputation: imputing missing data in data sets with heterogeneous attributes (their independent attributes are of different types), referred to as imputing mixed-attribute data sets. Although many real applications fall into this setting, no estimator has been designed for imputing mixed-attribute data sets. The paper first proposes two consistent estimators, one for discrete and one for continuous missing target values. It then advocates a mixture-kernel-based iterative estimator for imputing mixed-attribute data sets. Extensive experiments compare the proposed method with several typical algorithms, and the results show that it outperforms these existing imputation methods in terms of classification accuracy and root mean square error (RMSE) at different missing ratios.
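
The abstract gives no formulas, so the following is only a minimal sketch of the general idea of kernel-based imputation over mixed attributes, not the authors' estimator. It assumes a product kernel (Gaussian on continuous attributes, a match/mismatch weight on discrete ones), a kernel-weighted mean for a continuous target, and a kernel-weighted vote for a discrete target. The function names, the bandwidth `h`, and the mismatch weight `lam` are all hypothetical, and the iterative refinement mentioned in the abstract is not covered.

```python
import math

def mixed_kernel(x, xi, cont_idx, disc_idx, h=1.0, lam=0.2):
    """Product kernel over mixed attributes (illustrative assumption).

    Continuous attributes contribute a Gaussian factor with bandwidth h;
    discrete attributes contribute 1.0 on a match and lam on a mismatch.
    """
    w = 1.0
    for j in cont_idx:
        u = (x[j] - xi[j]) / h
        w *= math.exp(-0.5 * u * u)
    for j in disc_idx:
        w *= 1.0 if x[j] == xi[j] else lam
    return w

def impute_continuous(x, rows, targets, cont_idx, disc_idx, h=1.0, lam=0.2):
    """Kernel-weighted mean of the observed continuous targets."""
    weights = [mixed_kernel(x, r, cont_idx, disc_idx, h, lam) for r in rows]
    total = sum(weights)
    if total == 0.0:
        # All weights underflowed: fall back to the plain mean.
        return sum(targets) / len(targets)
    return sum(w * t for w, t in zip(weights, targets)) / total

def impute_discrete(x, rows, labels, cont_idx, disc_idx, h=1.0, lam=0.2):
    """Label with the largest total kernel weight among complete rows."""
    scores = {}
    for r, y in zip(rows, labels):
        w = mixed_kernel(x, r, cont_idx, disc_idx, h, lam)
        scores[y] = scores.get(y, 0.0) + w
    return max(scores, key=scores.get)

def rmse(predicted, actual):
    """Root mean square error between two equal-length sequences."""
    n = len(actual)
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / n)

# Toy data (made up): one continuous and one discrete independent attribute.
rows = [(1.0, "a"), (1.2, "a"), (5.0, "b"), (5.1, "b")]
targets = [10.0, 12.0, 50.0, 52.0]
labels = ["low", "low", "high", "high"]
query = (1.1, "a")

print(impute_continuous(query, rows, targets, [0], [1], h=0.5))
print(impute_discrete(query, rows, labels, [0], [1], h=0.5))
```

In this toy setup the query sits between the first two rows and shares their discrete value, so those rows dominate both the weighted mean and the vote. Choosing `h` and `lam` would normally require cross-validation, which the sketch leaves out.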


Bibliographic details

Source: Knowledge and Data Engineering, IEEE Transactions on, 2011, Issue 1, pp. 110-121 (12 pages)
Authors: Zhu Xiaofeng; Zhang Shichao; Jin Zhi; Zhang Zili; Xu Zhuoming
Original format: PDF
Language: English
Keywords: Classification; data mining; machine learning; methodologies


Similar literature

1. Projected Outlier Detection In High-dimensional Mixed-attributes Data Set [J]. Mao Ye, Xue Li, Maria E. Orlowska. Expert Systems with Applications. 2009, Issue 3p2
2. Fast distributed outlier detection in mixed-attribute data sets [J]. Otey ME, Ghoting A, Parthasarathy S. Data Mining and Knowledge Discovery. 2006, Issue 2a3
3. Distance estimation in numerical data sets with missing values [J]. Eirola E., Doquire G., Verleysen M. Information Sciences: An International Journal. 2013
4. Detecting Network Anomalies in Mixed-Attribute Data Sets [C]. Khoi-Nguyen Tran, Huidong Jin. Knowledge Discovery and Data Mining, 2010. WKDD '10. 2010
5. Missing data in multivariate longitudinal studies: Comparing results from different missing data techniques using an empirical data set [D]. Jelicic, Helena. 2007
6. An alternative data filling approach for prediction of missing data in soft sets (ADFIS) [O]. Muhammad Sadiq Khan, Mohammed Ali Al-Garadi, Ainuddin Wahid Abdul Wahab
7. Fast distributed outlier detection in mixed-attribute data sets [O]. Matthew Eric Otey, Amol Ghoting, Srinivasan Parthasarathy. 2006
