Journal: JCO Clinical Cancer Informatics

Machine Learning Methods to Identify Missed Cases of Bladder Cancer in Population-Based Registries

Abstract

PURPOSE Population-based cancer incidence rates of bladder cancer may be underestimated. Accurate estimates are needed to understand the burden of bladder cancer in the United States. We developed a machine learning-based classifier to identify bladder cancer cases missed by cancer registries, evaluated its feasibility, and estimated the rate of bladder cancer cases potentially missed.

METHODS Data were from a population-based cohort of 37,940 bladder cancer cases 65 years of age and older in the SEER cancer registries linked with Medicare claims (2007-2013). Cases with other urologic cancers, abdominal cancers, and unrelated cancers were included as control groups. A cohort of cancer-free controls was also selected using the Medicare 5% random sample. We used five supervised machine learning methods to predict bladder cancer: classification and regression trees, random forest, logic regression, support vector machines, and logistic regression.

RESULTS Registry linkages yielded 37,940 bladder cancer cases and 766,303 cancer-free controls. Using health insurance claims, classification and regression trees distinguished bladder cancer cases from noncancer controls with very high accuracy (95%). Bacille Calmette-Guerin, cystectomy, and mitomycin were the most important predictors for identifying bladder cancer. From 2007 to 2013, we estimated that up to 3,300 bladder cancer cases in the United States may have been missed by the SEER registries. This would result in an average increase of 3.5% in the reported incidence rate.

CONCLUSION SEER cancer registries may miss bladder cancer cases during routine reporting. These missed cases can be identified by leveraging Medicare claims and data analytics, leading to more accurate estimates of bladder cancer incidence.
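The abstract does not describe how the classifiers were implemented. As a rough illustration of the idea behind a classification tree on claims data, the sketch below scores binary claim indicators by weighted Gini impurity and picks the best first split. Everything in it is an assumption made for illustration: the feature names (`bcg`, `cystectomy`, `mitomycin`, `office_visit`), the eight toy records, and the helper functions `gini` and `best_split` are hypothetical and are not taken from the study or from SEER-Medicare data.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_split(rows, labels, feature_names):
    """Return (feature_name, weighted_gini) for the binary feature whose
    0/1 split gives the lowest weighted Gini impurity."""
    n = len(labels)
    best = None
    for j, name in enumerate(feature_names):
        left = [y for x, y in zip(rows, labels) if x[j] == 0]
        right = [y for x, y in zip(rows, labels) if x[j] == 1]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if best is None or score < best[1]:
            best = (name, score)
    return best


# Hypothetical claim indicators (1 = a claim of that type is present).
FEATURES = ["bcg", "cystectomy", "mitomycin", "office_visit"]

# Invented toy records, NOT real data: one tuple per person.
ROWS = [
    (1, 0, 0, 1),
    (1, 1, 0, 1),
    (0, 1, 0, 0),
    (1, 0, 1, 1),
    (0, 0, 0, 1),
    (0, 0, 0, 0),
    (0, 0, 0, 1),
    (0, 0, 0, 0),
]
# 1 = bladder cancer case, 0 = cancer-free control (toy labels).
LABELS = [1, 1, 1, 1, 0, 0, 0, 0]

if __name__ == "__main__":
    feature, score = best_split(ROWS, LABELS, FEATURES)
    print(f"best first split: {feature} (weighted Gini = {score:.3f})")
```

A full classification and regression tree would apply this split search recursively to each resulting subset and would be evaluated on held-out data. On this toy data the treatment indicators separate cases from controls better than the generic `office_visit` flag, which is only meant to mirror the kind of predictor ranking the abstract reports.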
