Information-based optimal subdata selection for big data logistic regression


Abstract

Technological advances have enabled exponential growth in data volumes, and proven statistical methods are no longer applicable to extraordinarily large data sets because of computational limitations. Subdata selection is an effective strategy for addressing this issue. In this study, we investigate existing sampling approaches and propose a novel framework for selecting subsets of data for logistic regression models. We show that, while the information contained in subdata obtained by random sampling is limited by the size of the subset, the information contained in subdata selected under the new framework increases as the size of the full data set increases. The performance of the proposed approach is compared with that of existing methods under various criteria via extensive simulation studies. (C) 2020 Elsevier B.V. All rights reserved.
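The "information contained in the subdata" can be made concrete with the standard Fisher information matrix of a logistic regression model, whose log-determinant is the usual D-optimality criterion. The following is a minimal sketch, not the selection procedure proposed in the paper: it only evaluates that criterion for the full data and for a uniform random subsample. The coefficient vector `beta`, the sizes `n` and `k`, the simulated covariates, and the helper names are all assumptions made for illustration.

```python
import numpy as np


def logistic_information(X, beta):
    """Fisher information sum_i p_i (1 - p_i) x_i x_i^T of a logistic model."""
    eta = X @ beta
    p = 1.0 / (1.0 + np.exp(-eta))
    w = p * (1.0 - p)
    # Weighted Gram matrix X^T diag(w) X.
    return (X * w[:, None]).T @ X


def log_det_information(X, beta):
    """D-optimality criterion: log-determinant of the information matrix."""
    sign, logdet = np.linalg.slogdet(logistic_information(X, beta))
    return logdet if sign > 0 else -np.inf


# Illustrative setup (all values are assumptions, not taken from the paper).
rng = np.random.default_rng(0)
n, k = 100_000, 1_000
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
beta = np.array([0.5, 1.0, -1.0])  # hypothetical true coefficients

# Uniform random subsample of size k, drawn without replacement.
idx = rng.choice(n, size=k, replace=False)

full_logdet = log_det_information(X, beta)
sub_logdet = log_det_information(X[idx], beta)
print(f"log det, full data (n={n}): {full_logdet:.2f}")
print(f"log det, random subsample (k={k}): {sub_logdet:.2f}")
```

Because every observation adds a positive semidefinite term to the information matrix, the criterion for any subset cannot exceed the criterion for the full data. A selection rule would choose `idx` deterministically to make `sub_logdet` as large as possible, rather than drawing it at random as done here.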


Bibliographic details

Source: Journal of Statistical Planning and Inference | 2020, issue 2020 | 11 pages
Original format: PDF
Language: English
Chinese Library Classification: Statistics
Keywords: D-optimality; Information matrix; Subsampling

