Journal of Clinical Epidemiology

Using methods from the data-mining and machine-learning literature for disease classification and prediction: A case study examining classification of heart failure subtypes

Abstract

Objective: Physicians classify patients into those with or without a specific disease. Furthermore, there is often interest in classifying patients according to disease etiology or subtype. Classification trees are frequently used to classify patients according to the presence or absence of a disease. However, classification trees can suffer from limited accuracy. In the data-mining and machine-learning literature, alternate classification schemes have been developed. These include bootstrap aggregation (bagging), boosting, random forests, and support vector machines.

Study Design and Setting: We compared the performance of these classification methods with that of conventional classification trees to classify patients with heart failure (HF) according to the following subtypes: HF with preserved ejection fraction (HFPEF) and HF with reduced ejection fraction. We also compared the ability of these methods to predict the probability of the presence of HFPEF with that of conventional logistic regression.

Results: We found that modern, flexible tree-based methods from the data-mining literature offer substantial improvement in prediction and classification of HF subtype compared with conventional classification and regression trees. However, conventional logistic regression had superior performance for predicting the probability of the presence of HFPEF compared with the methods proposed in the data-mining literature.

Conclusion: The use of tree-based methods offers superior performance over conventional classification and regression trees for predicting and classifying HF subtypes in a population-based sample of patients from Ontario, Canada. However, these methods do not offer substantial improvements over logistic regression for predicting the presence of HFPEF.
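As an illustration of one of the methods named in the abstract, bootstrap aggregation (bagging) can be sketched in a few lines of plain Python. This is a minimal sketch under stated assumptions, not the study's implementation: the base learner is a one-split tree ("stump") rather than a full classification tree, the data are synthetic rather than the Ontario HF cohort, and every function name (`fit_stump`, `fit_bagged`, `predict_proba`, `make_data`) is hypothetical. The fraction of stumps voting for class 1 serves as a rough probability estimate, loosely analogous to predicting the probability of HFPEF.

```python
import random

def fit_stump(X, y):
    """Fit a one-split tree (a "stump") by minimising training errors."""
    best = None  # (errors, feature, threshold, left_label, right_label)
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for left, right in ((0, 1), (1, 0)):
                errors = sum(
                    (left if row[j] <= t else right) != label
                    for row, label in zip(X, y)
                )
                if best is None or errors < best[0]:
                    best = (errors, j, t, left, right)
    _, j, t, left, right = best
    return lambda row: left if row[j] <= t else right

def fit_bagged(X, y, n_trees, rng):
    """Bagging: fit one stump per bootstrap resample of the training data."""
    n = len(X)
    stumps = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # n rows, with replacement
        stumps.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return stumps

def predict_proba(stumps, row):
    """Fraction of stumps voting for class 1, used as a probability estimate."""
    return sum(s(row) for s in stumps) / len(stumps)

def predict(stumps, row):
    """Majority vote across the bagged stumps."""
    return 1 if predict_proba(stumps, row) >= 0.5 else 0

def accuracy(model_predict, X, y):
    return sum(model_predict(r) == l for r, l in zip(X, y)) / len(y)

def make_data(n, rng):
    """Synthetic two-feature data (an assumption; not patient data)."""
    X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(n)]
    y = [1 if a + b + rng.gauss(0, 0.5) > 0 else 0 for a, b in X]
    return X, y

rng = random.Random(0)
X_train, y_train = make_data(150, rng)
X_test, y_test = make_data(150, rng)

single = fit_stump(X_train, y_train)
bagged = fit_bagged(X_train, y_train, n_trees=25, rng=rng)

acc_single = accuracy(single, X_test, y_test)
acc_bagged = accuracy(lambda r: predict(bagged, r), X_test, y_test)
print(f"single stump accuracy:  {acc_single:.3f}")
print(f"bagged stumps accuracy: {acc_bagged:.3f}")
```

Whether the bagged ensemble beats the single stump depends on the data and the base learner; the sketch only shows the resample, refit, and vote mechanics, and says nothing about the results reported in the abstract.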
