Conference: Annual International Conference of the IEEE Engineering in Medicine and Biology Society

Using machine learning models to classify stroke risk level based on national screening data


Abstract

Stroke, with its high incidence, prevalence and mortality, places a heavy burden on families and society in China. In 2009, the Ministry of Health of China launched the China national stroke screening and intervention program, which screens for stroke risk factors and delivers interventions to high-risk individuals among people over 40 years of age throughout China. In this program, the stroke risk factors are hypertension, diabetes, dyslipidemia, atrial fibrillation, smoking, lack of exercise, being markedly overweight or obese, and a family history of stroke. People with more than two risk factors, or with a history of stroke or transient ischemic attack (TIA), are considered high-risk. However, this criterion cannot assign a stroke risk level to people whose risk-factor fields contain "unknown" values. The resulting missing risk levels reduce the efficiency of stroke interventions and introduce inaccuracies into national-level statistics. In this paper, we first construct a training set and a test set and rebalance the imbalanced training set using oversampling and undersampling. We then develop logistic regression, decision tree, neural network and random forest models for stroke risk classification and evaluate them on recall and precision. The results show that the random forest model achieves the best performance in terms of recall and precision. The models constructed in this paper can improve screening efficiency and avoid unnecessary rescreening and intervention expenditures.
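The screening criterion described in the abstract can be sketched in a few lines, which also shows where "unknown" values get in the way. This is only an illustrative reading, not the program's actual implementation: the field names, the use of `None` for "unknown", and the choice to return "undetermined" only when an unknown value could change the outcome are all assumptions made here.

```python
from typing import Optional

# Hypothetical field names for the eight risk factors listed in the abstract.
RISK_FACTORS = (
    "hypertension", "diabetes", "dyslipidemia", "atrial_fibrillation",
    "smoking", "lack_of_exercise", "overweight_or_obese",
    "family_history_of_stroke",
)


def rule_based_risk(record: dict) -> Optional[bool]:
    """Return True (high-risk), False (not high-risk) or None (undetermined).

    Each field is True, False, or None; None stands for "unknown" (assumption).
    """
    history = record.get("stroke_or_tia_history")
    if history is True:
        # A history of stroke or TIA is sufficient on its own.
        return True

    values = [record.get(name) for name in RISK_FACTORS]
    known_positive = sum(1 for v in values if v is True)
    unknown = sum(1 for v in values if v is None)

    if known_positive > 2:
        # "More than two risk factors" is already satisfied.
        return True
    if history is None or known_positive + unknown > 2:
        # An unknown field could still push the record over the threshold.
        return None
    return False
```

Records for which this sketch returns `None` are the ones the rule leaves without a risk level, and they are the cases the paper's classification models are meant to cover.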
