Source: PLoS Clinical Trials (listed under U.S. National Institutes of Health literature)

The Univariate Flagging Algorithm (UFA): An interpretable approach for predictive modeling



Abstract

In many data classification problems, several methods give similar accuracy. However, when working with people who are not experts in data science, such as doctors, lawyers, and judges, finding interpretable algorithms can be a critical success factor. Practitioners have a deep understanding of the individual input variables but far less insight into how they interact with each other. For example, there may be ranges of an input variable for which the observed outcome is significantly more or less likely. This paper describes an algorithm for the automatic detection of such thresholds, called the Univariate Flagging Algorithm (UFA). The algorithm searches for a separation that maximizes the difference between the separated regions while maintaining a high level of support. We evaluate its performance using six sample datasets and demonstrate that the thresholds identified by the algorithm align well with published results and known physiological boundaries. We also introduce two classification approaches that use UFA and show that their performance on unseen test data is comparable to or better than that of traditional classifiers when confidence intervals are considered. We identify conditions under which UFA performs well, including applications with large amounts of missing or noisy data, applications with a large number of inputs relative to observations, and applications where the incidence of the target is low. We argue that ease of explanation of the results, robustness to missing data and noise, and detection of low-incidence adverse outcomes are desirable features for clinical applications, and that they can be achieved with a relatively simple classifier such as UFA.
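The abstract does not state the objective function UFA optimizes, so the following is only a minimal sketch of the general idea of a univariate threshold search, not the authors' method. It assumes a binary outcome, flags observations at or above a candidate threshold, scores each candidate with a pooled two-proportion z-statistic, and enforces a minimum number of observations on each side as a stand-in for "support". The function name `find_flag_threshold` and the parameter `min_support` are hypothetical.

```python
import math
from typing import Optional, Sequence, Tuple


def find_flag_threshold(
    values: Sequence[Optional[float]],
    outcomes: Sequence[int],
    min_support: int = 5,
) -> Optional[Tuple[float, float]]:
    """Return (threshold, score) for the best 'value >= threshold' flag.

    Assumption: the score is a pooled two-proportion z-statistic; the
    abstract does not specify the criterion actually used by UFA.
    Returns None when no candidate satisfies the support constraint.
    """
    # Skip missing values so one variable can be scanned independently.
    data = [(v, y) for v, y in zip(values, outcomes) if v is not None]
    if not data:
        return None
    pooled = sum(y for _, y in data) / len(data)

    best: Optional[Tuple[float, float]] = None
    for t in sorted({v for v, _ in data}):
        flagged = [y for v, y in data if v >= t]
        rest = [y for v, y in data if v < t]
        # Require enough observations on both sides of the split.
        if len(flagged) < min_support or len(rest) < min_support:
            continue
        se = math.sqrt(
            pooled * (1 - pooled) * (1 / len(flagged) + 1 / len(rest))
        )
        if se == 0:
            continue  # all outcomes identical, nothing to separate
        rate_flagged = sum(flagged) / len(flagged)
        rate_rest = sum(rest) / len(rest)
        score = (rate_flagged - rate_rest) / se
        if best is None or score > best[1]:
            best = (t, score)
    return best


if __name__ == "__main__":
    # Toy data: the outcome occurs only when the variable exceeds 12.
    xs = [float(i) for i in range(1, 21)]
    ys = [1 if x > 12 else 0 for x in xs]
    print(find_flag_threshold(xs, ys))
```

A threshold found this way can be reported as a single readable rule ("flag when the variable is at or above t"), which is the kind of interpretability the abstract emphasizes. Scanning for low-side flags would need the symmetric comparison `v <= t`, which is omitted here for brevity.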
