Journal: SIGKDD Explorations

A Permutation Approach to Assess Confounding in Machine Learning Applications for Digital Health

Abstract

Machine learning applications are often plagued by confounders that can impact the generalizability of the learners. In clinical settings, demographic characteristics often play the role of confounders. Confounding is especially problematic in remote digital health studies where the participants self-select to enter the study, thereby making it difficult to balance the demographic characteristics of participants. One effective approach to combat confounding is to match samples with respect to the confounding variables in order to improve the balance of the data. This procedure, however, leads to smaller datasets and hence negatively impacts the inferences drawn from the learners. Alternatively, confounding adjustment methods that make more efficient use of the data (such as inverse probability weighting) usually rely on modeling assumptions, and it is unclear how robust these methods are to violations of these assumptions. Here, instead of proposing a new method to control for confounding, we develop novel permutation-based statistical tools to detect and quantify the influence of observed confounders, and to estimate the unconfounded performance of the learner. Our tools can be used to evaluate the effectiveness of existing confounding adjustment methods. We evaluate the statistical properties of our methods in a simulation study, and illustrate their application using real-life data from a Parkinson's disease mobile health study collected in an uncontrolled environment.
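
The abstract does not spell out the permutation scheme, so the following is only a rough sketch of one way such a check could look, assuming a "restricted" permutation in which labels are shuffled within levels of an observed confounder. Under that assumption, a standard permutation null reflects no association at all, while the restricted null reflects the performance attainable through the confounder alone. All names (`auc`, `restricted_permutation`, `permutation_nulls`), the toy data, and the use of fixed scores instead of retraining a learner on each permutation are illustrative choices, not details taken from the paper.

```python
import numpy as np

def auc(scores, labels):
    """AUC via the Mann-Whitney U statistic (assumes no tied scores)."""
    order = np.argsort(scores)
    ranks = np.empty(len(scores), dtype=float)
    ranks[order] = np.arange(1, len(scores) + 1)
    n_pos = labels.sum()
    n_neg = len(labels) - n_pos
    return (ranks[labels == 1].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def restricted_permutation(labels, confounder, rng):
    """Shuffle labels only within each confounder level.

    This keeps the label-confounder association intact while breaking any
    direct label-feature association (an assumed scheme, see lead-in).
    """
    permuted = labels.copy()
    for level in np.unique(confounder):
        idx = np.flatnonzero(confounder == level)
        permuted[idx] = rng.permutation(labels[idx])
    return permuted

def permutation_nulls(scores, labels, confounder, n_perm=1000, seed=0):
    """Return (standard, restricted) null distributions of the AUC."""
    rng = np.random.default_rng(seed)
    standard = np.array(
        [auc(scores, rng.permutation(labels)) for _ in range(n_perm)]
    )
    restricted = np.array(
        [auc(scores, restricted_permutation(labels, confounder, rng))
         for _ in range(n_perm)]
    )
    return standard, restricted

# Toy data with pure confounding: the feature x depends only on the
# confounder c (say, an age group), and the label y is associated with c.
rng = np.random.default_rng(42)
n = 400
c = rng.integers(0, 2, n)
y = (rng.random(n) < np.where(c == 1, 0.8, 0.2)).astype(int)
x = 2.0 * c + rng.normal(0.0, 1.0, n)

observed = auc(x, y)
standard, restricted = permutation_nulls(x, y, c)

# One-sided permutation p-values with the usual +1 correction.
p_standard = (1 + np.sum(standard >= observed)) / (1 + len(standard))
p_restricted = (1 + np.sum(restricted >= observed)) / (1 + len(restricted))
print(f"observed AUC={observed:.3f}")
print(f"standard null mean={standard.mean():.3f}, p={p_standard:.3f}")
print(f"restricted null mean={restricted.mean():.3f}, p={p_restricted:.3f}")
```

In this toy setup the observed AUC looks impressive against the standard null but is unremarkable against the restricted null, which is the pattern one would expect when the apparent predictive performance is driven by the confounder.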
