Conference: International Conference on Intelligent Computing and Control Systems

Data Mining for Early Gastric Cancer Etiological Factors from Diet-Lifestyle Characteristics

Abstract

Gastric cancer, more than other cancer types, is predominantly caused by demographic and dietary factors. This study aims to predict Early Gastric Cancer (EGC) factors from the diet and lifestyle characteristics of people of Mizo ethnicity using supervised machine learning algorithms. We selected 80 cases and 160 controls, and chose a dataset of 11 features that are core risk factors for gastric cancer for data mining. The learning curves show that Naive Bayes, Logistic Regression, and Multilayer Perceptron are the best-fitting classification algorithms for our dataset. Models are constructed and evaluated using Brier score, accuracy, precision-recall curves for cases (patients) and controls (healthy individuals), and false positives. Naive Bayes gives the best classification results, with an accuracy of 90%, the lowest Brier score (0.1), and a false positive rate of 3%, compared with the other models. The Logistic Regression classifier performs comparably, but with a worse Brier score and more false positives. The study found that extra salt, tuibur, smoking, and alcohol are the non-invasive etiological factors for gastric cancer in the Mizoram population, as predicted by the Naive Bayes algorithm. This knowledge will help in initiating early screening and in educating the public about dietary and lifestyle risk factors in a high-risk population with unique habits.
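The evaluation metrics named in the abstract can be illustrated with a minimal sketch. The abstract does not give the authors' exact definitions or code, so this block assumes the standard forms: the Brier score as the mean squared difference between predicted probability and the 0/1 outcome, and the false positive rate as FP / (FP + TN). The labels and probabilities are made-up toy values, not data from the study, and the 0.5 decision threshold is likewise an assumption.

```python
from typing import Sequence


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def false_positive_rate(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Share of true controls (label 0) predicted as cases (label 1).

    Assumed definition: FP / (FP + TN). The abstract does not state
    which definition the authors used.
    """
    false_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    negatives = sum(1 for t in y_true if t == 0)
    return false_pos / negatives if negatives else 0.0


# Hypothetical toy data: 1 = case (patient), 0 = control (healthy individual).
y_true = [1, 1, 0, 0, 0, 0]
y_prob = [0.9, 0.4, 0.2, 0.6, 0.1, 0.3]  # predicted probability of being a case

# Assumed decision threshold of 0.5 to turn probabilities into class labels.
y_pred = [1 if p >= 0.5 else 0 for p in y_prob]

print("Brier score:", brier_score(y_true, y_prob))
print("Accuracy:", accuracy(y_true, y_pred))
print("False positive rate:", false_positive_rate(y_true, y_pred))
```

A lower Brier score indicates better-calibrated probabilities, which is why it complements accuracy when comparing classifiers: accuracy only looks at the thresholded labels, while the Brier score also penalizes overconfident wrong probabilities.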
