Evaluation of Combining Bootstrap with Multiple Imputation Using R on Knights Landing Platform

Abstract

Cloud computing and big data technologies are converging to offer a cost-effective delivery model for cloud-based big data analytics. Though the impacts of big data size and scaling on the cloud have been extensively studied, the effects of the complexity of the underlying analytic methods on cloud performance have received less attention. This paper will develop and evaluate a computationally intensive statistical methodology to perform inference in the presence of both non-Gaussian data and missing data. Two well-established statistical approaches, bootstrap and multiple imputation (MI), will be combined to form the methodology. Bootstrap is a computer-based nonparametric resampling procedure that involves randomly resampling the data many thousands of times to construct an empirical distribution, which is then used to construct confidence intervals for significance tests. This statistical technique enables scientists who conduct studies on data with known non-normality to obtain higher-quality significance tests than is possible with a traditional asymptotic, normal-theory-based significance test. However, the bootstrapping procedure only works when no data are missing or the data are missing completely at random (MCAR). Missing data can lead to biased estimates when the MCAR assumption is violated. It is unclear how best to implement a bootstrapping procedure in the presence of missing data. The proposed methods will provide guidelines and procedures that will enable researchers to use the technique in all areas of health, behavior and developmental science in which a study has missing data and cannot rely on parametric inference. Either bootstrapping or MI can be computationally expensive, and combining the two can lead to further computation costs in the cloud. Using carefully constructed simulation examples, we demonstrate that it is feasible to implement the proposed methodology on a high-performance Knights Landing platform. However, the computation costs are substantial even with small data sizes. Further work is needed to study the effects of optimizing the implementation and its performance with big data.
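
The following is a minimal sketch of how a bootstrap and multiple imputation might be combined, not the authors' implementation. The paper used R; Python's standard library is used here purely for illustration. The order of operations (resample first, then impute each resample), the imputation model (a random draw from the observed values, which is only reasonable under MCAR), the percentile interval, the function names, and the toy data are all assumptions that the abstract does not specify.

```python
import random
import statistics

def impute_once(sample, rng):
    """Fill each missing value (None) with a random draw from the observed values."""
    observed = [x for x in sample if x is not None]
    return [x if x is not None else rng.choice(observed) for x in sample]

def boot_mi_percentile_ci(data, n_boot=1000, n_imp=5, alpha=0.05, seed=0):
    """Percentile CI for the mean: bootstrap the rows, then impute each resample n_imp times."""
    rng = random.Random(seed)
    n = len(data)
    estimates = []
    while len(estimates) < n_boot:
        # Draw n rows with replacement; missing entries are resampled as-is.
        resample = [data[rng.randrange(n)] for _ in range(n)]
        if all(x is None for x in resample):
            continue  # nothing observed to impute from; draw again
        # Average the statistic over the n_imp completed copies of this resample.
        means = [statistics.mean(impute_once(resample, rng)) for _ in range(n_imp)]
        estimates.append(statistics.mean(means))
    estimates.sort()
    lo = estimates[int((alpha / 2) * n_boot)]
    hi = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

# Toy skewed data with a few missing entries (made up for illustration).
toy = [0.2, 0.5, None, 1.1, 0.3, 4.8, None, 0.9, 2.7, 0.4, 0.6, None, 7.5, 0.8, 1.6]
print(boot_mi_percentile_ci(toy))
```

With these defaults the statistic is evaluated n_boot × n_imp = 5,000 times on a 15-row data set, which hints at why the combined procedure becomes expensive even when the data are small.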

Bibliographic record

Source
IEEE International Conference on Cyber Security and Cloud Computing | 2017 | 376p | 4 pages
Authors
Chuan Zhou; Yuxiang Gao; Waylon Howard
Original format: PDF
Chinese Library Classification: TP393-53
Keywords
Cloud computing; Servers; Big Data; Computational modeling; Sociology; Statistics;

