Inconsistency in the use of the term “validation” in studies reporting the performance of deep learning algorithms in providing diagnosis from medical imaging

Abstract

Background: The development of deep learning (DL) algorithms is a three-step process: training, tuning, and testing. Studies are inconsistent in the use of the term "validation", with some using it to refer to tuning and others testing, which hinders accurate delivery of information and may inadvertently exaggerate the performance of DL algorithms. We investigated the extent of inconsistency in usage of the term "validation" in studies on the accuracy of DL algorithms in providing diagnosis from medical imaging.

Methods and findings: We analyzed the full texts of research papers cited in two recent systematic reviews. The papers were categorized according to whether the term "validation" was used to refer to tuning alone, both tuning and testing, or testing alone. We analyzed whether paper characteristics (i.e., journal category, field of study, year of print publication, journal impact factor [JIF], and nature of test data) were associated with the usage of the terminology using multivariable logistic regression analysis with generalized estimating equations. Of 201 papers published in 125 journals, 118 (58.7%), 9 (4.5%), and 74 (36.8%) used the term to refer to tuning alone, both tuning and testing, and testing alone, respectively. A weak association was noted between higher JIF and using the term to refer to testing (i.e., testing alone or both tuning and testing) instead of tuning alone (vs. JIF 10: adjusted odds ratio 2.41, P = 0.089). Journal category, field of study, year of print publication, and nature of test data were not significantly associated with the terminology usage.

Conclusions: Existing literature has a significant degree of inconsistency in using the term "validation" when referring to the steps in DL algorithm development. Efforts are needed to improve the accuracy and clarity in the terminology usage.
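As an illustration of the three-step process named in the Background, the following is a minimal sketch of how a dataset might be partitioned into training, tuning, and test subsets. It is not taken from the paper: the function name `three_way_split`, the parameters `tune_frac` and `test_frac`, and the 70/15/15 proportions are all hypothetical choices made for the example. The subsets are labelled by function (tuning, testing) rather than with the ambiguous word "validation".

```python
import random


def three_way_split(items, tune_frac=0.15, test_frac=0.15, seed=0):
    """Shuffle items and split them into training, tuning, and test subsets.

    Hypothetical helper for illustration only; the fractions are arbitrary.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible

    n_test = round(len(shuffled) * test_frac)
    n_tune = round(len(shuffled) * tune_frac)

    test = shuffled[:n_test]                 # held out; used once for final testing
    tune = shuffled[n_test:n_test + n_tune]  # used for model selection and hyperparameters
    train = shuffled[n_test + n_tune:]       # used to fit model parameters
    return train, tune, test


train, tune, test = three_way_split(range(100))
print(len(train), len(tune), len(test))  # prints: 70 15 15
```

The sketch splits at the level of individual items. For medical imaging, a real pipeline would presumably split by patient, and a test set drawn from a different institution would be a separate design decision; neither is modelled here.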