Knowledge-Based Systems

Imputation of missing data with neural networks for classification

Abstract

We propose a mechanism to use data with missing values for designing classifiers, which is different from predicting missing values for classification. Our imputation method uses an auto-encoder neural network. We make innovative use of the training instances without missing values to train the auto-encoder so that it is better equipped to predict missing values. It is a two-stage training scheme. Unlike most existing auto-encoder-based methods, which use a bottleneck layer for missing-data handling, we justify and use a latent space of much higher dimension than that of the input. To design a classifier using a training set with missing values, we use the trained auto-encoder to predict the missing values, based on the hypothesis that a good choice for a missing value is one that can reconstruct itself via the auto-encoder. For this we make an initial guess of the missing value using the nearest-neighbor rule and then refine it by minimizing the reconstruction error. We train several classifiers using the union of the imputed instances and the remaining training instances without missing values. We also train another classifier of the same type with the same configuration using the corresponding complete dataset, and the performance of these classifiers is compared. We compare the proposed method with eight state-of-the-art imputation techniques using fourteen datasets and eight classification strategies. (C) 2019 Elsevier B.V. All rights reserved.
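Below is a minimal sketch of the imputation idea summarized in the abstract, written in PyTorch. The layer sizes, activations, optimizer settings, and names (`OvercompleteAE`, `train_autoencoder`, `impute_row`) are illustrative assumptions rather than the authors' configuration, and the paper's two-stage training scheme is collapsed into a single ordinary training loop here.

```python
import numpy as np
import torch
import torch.nn as nn


class OvercompleteAE(nn.Module):
    """Auto-encoder whose latent space is wider than the input, as the abstract advocates."""

    def __init__(self, d_in: int, d_latent: int):
        super().__init__()
        assert d_latent > d_in  # higher-dimensional latent space, not a bottleneck
        self.enc = nn.Sequential(nn.Linear(d_in, d_latent), nn.Tanh())
        self.dec = nn.Linear(d_latent, d_in)

    def forward(self, x):
        return self.dec(self.enc(x))


def train_autoencoder(model, complete_rows, epochs=200, lr=1e-3):
    """Fit the auto-encoder on the training instances that have no missing values."""
    x = torch.as_tensor(complete_rows, dtype=torch.float32)
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        loss = nn.functional.mse_loss(model(x), x)
        loss.backward()
        opt.step()
    return model


def impute_row(model, row, complete_rows, steps=100, lr=1e-2):
    """Fill the NaN entries of `row`: initialise them from the nearest complete
    neighbour on the observed coordinates, then refine only those entries by
    minimising the auto-encoder reconstruction error ("reconstruct itself")."""
    model.requires_grad_(False)
    miss = np.isnan(row)
    # Nearest-neighbour initial guess, measured on the observed coordinates only.
    dists = np.linalg.norm(complete_rows[:, ~miss] - row[~miss], axis=1)
    init = row.copy()
    init[miss] = complete_rows[dists.argmin(), miss]

    free = torch.tensor(init, dtype=torch.float32, requires_grad=True)
    fixed = torch.as_tensor(np.where(miss, 0.0, row), dtype=torch.float32)
    mask = torch.as_tensor(miss)
    opt = torch.optim.Adam([free], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        x = torch.where(mask, free, fixed)    # observed entries stay fixed
        loss = ((model(x) - x) ** 2).mean()   # reconstruction error of the candidate row
        loss.backward()
        opt.step()

    out = row.copy()
    out[miss] = free.detach().numpy()[miss]
    return out
```

In this sketch, each incomplete training row would be passed through `impute_row`, and the imputed rows together with the originally complete rows would then form the training set for the downstream classifiers, as the abstract describes.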
