Model Selection and Variable Aggregation of Australian Hospital Data

Abstract

Background: Hospital administrative data commonly consist of hundreds of variables, many of which have hundreds, if not thousands, of distinct categories, especially for disease groups. Conventional approaches to developing regression models for prediction either fail completely due to multicollinearity or sparsity issues, or take too long and consume too many computing resources.

Methods: We demonstrate how regularisation and variable aggregation techniques such as Elastic Net can overcome some of these problems. Parameter estimates from univariate generalised linear models (GLMs) and Elastic Net models were used to aggregate disease groups into a more manageable number and to predict the probability of mortality for a given patient.

Results: When employed for variable aggregation and variable selection, Elastic Net models ran at least four times faster than GLMs, though they produced a less discriminative model. When applied to final models for predicting hospital mortality, however, Elastic Net and GLM models demonstrated similar predictive power and efficiently solved an otherwise complex problem.

Conclusion: Elastic Net regularisation and variable aggregation provide an efficient mechanism for solving healthcare modelling problems.
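The aggregation step described in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that each disease group is scored by the log-odds of death from a one-predictor logistic model (for which the fitted probability is the group's observed death rate), and that groups are then pooled into equal-sized risk bands by rank. The group codes, counts, smoothing constants and the function names `univariate_log_odds` and `aggregate_groups` are all hypothetical.

```python
import math
from collections import defaultdict


def univariate_log_odds(records, prior_deaths=0.5, prior_n=1.0):
    """Return a smoothed log-odds of death for each disease group.

    `records` is an iterable of (group, died) pairs with died in {0, 1}.
    The additive smoothing is an assumption of this sketch; it keeps the
    estimate finite for groups with no deaths or no survivors.
    """
    deaths = defaultdict(float)
    totals = defaultdict(float)
    for group, died in records:
        totals[group] += 1
        deaths[group] += died
    log_odds = {}
    for group in totals:
        p = (deaths[group] + prior_deaths) / (totals[group] + prior_n)
        log_odds[group] = math.log(p / (1 - p))
    return log_odds


def aggregate_groups(log_odds, n_bands=3):
    """Assign each group to one of `n_bands` risk bands by log-odds rank."""
    # Ties are broken by group name so the result is deterministic.
    ranked = sorted(log_odds, key=lambda g: (log_odds[g], g))
    return {g: rank * n_bands // len(ranked) for rank, g in enumerate(ranked)}


# Synthetic toy data (deaths, admissions) per hypothetical group code.
toy_counts = {
    "G1": (1, 200), "G2": (2, 150), "G3": (10, 100),
    "G4": (12, 120), "G5": (40, 100), "G6": (30, 60),
}
records = []
for group, (n_deaths, n_total) in toy_counts.items():
    records += [(group, 1)] * n_deaths + [(group, 0)] * (n_total - n_deaths)

log_odds = univariate_log_odds(records)
bands = aggregate_groups(log_odds, n_bands=3)
print(bands)
```

Replacing hundreds of disease-group indicators with a single band variable is what keeps the final mortality model small. The same banding could be driven by Elastic Net coefficients instead of univariate estimates, which is the comparison the abstract reports.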

Bibliographic details

Source: Australian National Health Informatics Conference, 2015, 6 pages
Authors: Liam HEINIGER; Norm GOOD; Sankalp KHANNA
Original format: PDF
Chinese Library Classification: R-058

