Graphical modeling of binary data using the LASSO: a simulation study

Ralf Strobl; Eva Grill; Ulrich Mansmann


Abstract

Background: Graphical models have been identified as a promising new approach to modeling high-dimensional clinical data. They provide a probabilistic tool to display, analyze and visualize net-like dependence structures by drawing a graph that describes the conditional dependencies between the variables. Until now, research has focused mainly on building Gaussian graphical models for continuous multivariate data following a multivariate normal distribution, and satisfactory solutions for binary data have been missing. We adapted the method of Meinshausen and Bühlmann to binary data and used the LASSO for logistic regression. The objective of this paper was to examine the performance of the Bolasso in developing graphical models for high-dimensional binary data. We hypothesized that the Bolasso is superior to competing LASSO methods at identifying graphical models.

Methods: We analyzed the Bolasso as a way to derive graphical models, in comparison with other LASSO-based methods. Model performance was assessed in a simulation study with random data generated via symmetric local logistic regression models and Gibbs sampling. The main outcome variables were the Structural Hamming Distance and the Youden Index. We applied the results of the simulation study to a real-life data set of functioning data from patients with head and neck cancer.

Results: Bootstrap aggregating, as incorporated in the Bolasso algorithm, greatly improved performance at higher sample sizes. The number of bootstraps had minimal impact on performance. The Bolasso performed reasonably well with a cutpoint of 0.90 and a small penalty term. Optimal prediction for the Bolasso leads to very conservative models in comparison with AIC, BIC or cross-validated optimal penalty terms.

Conclusions: Bootstrap aggregating may improve variable selection if the underlying selection process is not too unstable due to small sample size and if one is mainly interested in reducing the false discovery rate. We propose using the Bolasso for graphical modeling in large sample sizes.
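
The aggregation and evaluation steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that each bootstrap replicate has already been reduced to a set of undirected edges (the per-node LASSO logistic regressions are not shown), that an edge is kept when its selection frequency reaches the cutpoint (0.90 here, as reported in the abstract), and that the Structural Hamming Distance for undirected graphs is read as the number of edges that differ. All function names and the toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def norm(edge):
    """Store an undirected edge as a sorted tuple so that (a, b) == (b, a)."""
    return tuple(sorted(edge))


def aggregate_edges(bootstrap_edge_sets, cutpoint=0.90):
    """Keep edges selected in at least `cutpoint` of the bootstrap replicates."""
    counts = Counter()
    for edges in bootstrap_edge_sets:
        counts.update({norm(e) for e in edges})  # count each edge once per replicate
    n_boot = len(bootstrap_edge_sets)
    return {e for e, c in counts.items() if c / n_boot >= cutpoint}


def structural_hamming_distance(true_edges, est_edges):
    """Number of edges present in exactly one of the two undirected graphs."""
    t = {norm(e) for e in true_edges}
    s = {norm(e) for e in est_edges}
    return len(t ^ s)


def youden_index(true_edges, est_edges, n_nodes):
    """Sensitivity + specificity - 1, computed over all possible node pairs.

    Assumes the true graph has at least one edge and at least one non-edge.
    """
    t = {norm(e) for e in true_edges}
    s = {norm(e) for e in est_edges}
    n_pairs = len(list(combinations(range(n_nodes), 2)))
    tp = len(t & s)
    fn = len(t - s)
    fp = len(s - t)
    tn = n_pairs - tp - fn - fp
    return tp / (tp + fn) + tn / (tn + fp) - 1


# Toy example (made-up data): 4 nodes, 10 bootstrap replicates.
true_graph = {(0, 1), (1, 2), (2, 3)}
replicates = []
for b in range(10):
    edges = {(0, 1)}              # selected in all 10 replicates
    if b < 9:
        edges.add((2, 1))         # selected in 9 of 10, written in reverse order
    if b < 5:
        edges.add((2, 3))         # selected in 5 of 10
    if b < 2:
        edges.add((0, 3))         # selected in 2 of 10 (a false edge)
    replicates.append(edges)

estimated = aggregate_edges(replicates, cutpoint=0.90)
shd = structural_hamming_distance(true_graph, estimated)
youden = youden_index(true_graph, estimated, n_nodes=4)
print(sorted(estimated), shd, round(youden, 4))
```

In this toy case the high cutpoint drops the unstable false edge (0, 3) but also the true edge (2, 3), which is in line with the abstract's remark that the approach mainly helps when reducing false discoveries is the priority.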


Bibliographic details

Source: BMC Medical Research Methodology, 2012, Issue 1
Authors: Ralf Strobl; Eva Grill; Ulrich Mansmann
Original format: PDF
Chinese Library Classification: Medicine and health

