Journal: Asian Pacific Journal of Cancer Prevention

Analyzing a Lung Cancer Patient Dataset with the Focus on Predicting Survival Rate One Year after Thoracic Surgery



Abstract

Background: Data mining, a concept introduced in the mid-1990s, can help researchers gain new, profound insights and facilitate access to unanticipated knowledge sources in biomedical datasets. Many issues in the medical field concern the diagnosis of diseases based on tests conducted on individuals at risk. Early diagnosis and treatment can improve the survival of lung cancer patients. Researchers can use data mining techniques to create effective diagnostic models. The aim of this study was to evaluate patterns in risk factor data for mortality one year after thoracic surgery for lung cancer.

Methods: The dataset used in this study contained 470 records and 17 features. First, the most important variables involved in the incidence of lung cancer were extracted using knowledge discovery and data mining algorithms such as naive Bayes and maximum expectation. Then, using a regression analysis algorithm, a questionnaire was developed to predict the risk of death one year after lung surgery. Outliers in the data were identified with a clustering algorithm, reported, and excluded. Finally, a calculator was designed to estimate the risk of one-year post-operative mortality based on a scorecard algorithm.

Results: The most important factor involved in increased mortality was large tumor size. Roles for type II diabetes and preoperative dyspnea in lower survival were also identified. The feature most commonly shared in the classification of patients was forced expiratory volume in the first second (FEV1); patients could be classified into different categories based on its level.

Conclusion: A calculation-based questionnaire for diagnosing disease can be used to identify and fill knowledge gaps in clinical practice guidelines.
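The abstract does not give the scorecard's actual weights or cut-offs, so the following is only a minimal sketch of how a scorecard-style risk calculator of this kind might work. The factor names, the log-odds weights, the intercept, and the points scaling are all hypothetical placeholders chosen for illustration; none of them come from the study.

```python
import math

# Hypothetical log-odds weights: illustrative only, NOT taken from the study.
HYPOTHETICAL_WEIGHTS = {
    "large_tumor": 1.2,
    "type2_diabetes": 0.7,
    "preop_dyspnea": 0.6,
    "low_fev1": 0.5,
}
HYPOTHETICAL_INTERCEPT = -2.5  # assumed baseline log-odds
POINTS_PER_UNIT = 10           # assumed scaling from log-odds to integer points


def scorecard_points(patient):
    """Sum the integer points of every risk factor flagged True for a patient."""
    return sum(
        round(weight * POINTS_PER_UNIT)
        for name, weight in HYPOTHETICAL_WEIGHTS.items()
        if patient.get(name, False)
    )


def estimated_risk(patient):
    """Map scorecard points back to a probability with the logistic function."""
    log_odds = HYPOTHETICAL_INTERCEPT + scorecard_points(patient) / POINTS_PER_UNIT
    return 1.0 / (1.0 + math.exp(-log_odds))


if __name__ == "__main__":
    example = {"large_tumor": True, "preop_dyspnea": True}
    print(scorecard_points(example), round(estimated_risk(example), 3))
```

In a scorecard of this shape, each regression coefficient is rescaled to a small integer so that a clinician can add up points by hand; the logistic function then converts the total back into an estimated probability. A real calculator would need weights fitted to the dataset and validated clinically.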
