Distributed Neural Networks for Missing Big Data Imputation

Abstract

In this paper we investigate the use of Distributed Neural Networks for the imputation of missing values in a Big Data context. The presented data imputation framework is implemented in Spark, so imputation can be added as one more step in the data pre-processing pipeline. The Distributed Neural Networks model uses Mini-batch Stochastic Gradient Descent, which scales well with the cluster size and minimizes communication among the workers. The model is tested on a real-world Recommender Systems dataset, where missing data is generally a problem for new items, as the system's ranking is usually biased towards popular items. The model is compared with univariate (Mean and Median Imputation) and multivariate (K-Nearest Neighbours and Linear Regression) imputation techniques, and its performance is validated using prediction accuracy and speed. Furthermore, we evaluate the speedup compared to the sequential implementation of Neural Networks with Stochastic Gradient Descent.
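The data-parallel Mini-batch Stochastic Gradient Descent idea mentioned in the abstract can be illustrated with a minimal single-process sketch. Everything here is an assumption made for illustration and is not the authors' Spark implementation: the workers are simulated as list partitions, a one-feature linear model stands in for the neural network, the data are synthetic, and the names `worker_gradient`, `distributed_sgd`, and `impute` are hypothetical. Each simulated worker computes a gradient on a mini-batch from its own partition, and the driver averages those gradients, so only one small exchange per step is needed.

```python
import random
from statistics import mean


def worker_gradient(w, b, batch):
    """Mean-squared-error gradient for y ~ w*x + b on one worker's mini-batch."""
    n = len(batch)
    grad_w = sum(2 * (w * x + b - y) * x for x, y in batch) / n
    grad_b = sum(2 * (w * x + b - y) for x, y in batch) / n
    return grad_w, grad_b


def distributed_sgd(partitions, lr=0.3, batch_size=8, steps=2000, seed=0):
    """Simulated synchronous data-parallel mini-batch SGD (hypothetical helper)."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    for _ in range(steps):
        # Each simulated worker samples a mini-batch from its own partition.
        grads = [
            worker_gradient(w, b, rng.sample(part, min(batch_size, len(part))))
            for part in partitions
        ]
        # The driver averages the worker gradients and updates the shared model.
        w -= lr * mean(g[0] for g in grads)
        b -= lr * mean(g[1] for g in grads)
    return w, b


def impute(rows, w, b):
    """Fill missing targets (None) with the model's prediction."""
    return [(x, y if y is not None else w * x + b) for x, y in rows]


# Synthetic, noise-free toy data (an assumption): y = 3x + 1, one value missing.
data_rng = random.Random(1)
rows = [(x, 3 * x + 1) for x in (data_rng.random() for _ in range(200))]
missing_x = rows[5][0]
rows[5] = (missing_x, None)

observed = [row for row in rows if row[1] is not None]
partitions = [observed[i::4] for i in range(4)]  # four simulated workers

w, b = distributed_sgd(partitions)
completed = impute(rows, w, b)

# Univariate baseline for comparison: replace the gap with the observed mean.
mean_baseline = mean(y for _, y in observed)
```

In this toy setting the model-based imputation recovers the missing value from its feature, while the mean baseline assigns the same value to every gap regardless of the features.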

Bibliographic details

Source
International Joint Conference on Neural Networks, 2018, pp. 1-8 (8 pages)

Authors
Alessio Petrozziello; Ivan Jordanov; Christian Sommeregger

Original format: PDF

Keywords
Sparks; Neural networks; Machine learning; Big Data; Task analysis; Pipelines; Computational modeling
