Source: MIS Quarterly

A TREE-BASED APPROACH FOR ADDRESSING SELF-SELECTION IN IMPACT STUDIES WITH BIG DATA

Abstract

In this paper, we introduce a tree-based approach adjusting for observable self-selection bias in intervention studies in management research. In contrast to traditional propensity score (PS) matching methods, including those using classification trees as a subcomponent, our tree-based approach provides a standalone, automated, data-driven methodology that allows for (1) the examination of nascent interventions whose selection is difficult and costly to theoretically specify a priori, (2) detection of heterogeneous intervention effects for different pre-intervention profiles, (3) identification of pre-intervention variables that correlate with the self-selected intervention, and (4) visual presentation of intervention effects that is easy to discern and understand. As such, the tree-based approach is a useful tool for analyzing observational impact studies as well as for post-analysis of experimental data. The tree-based approach is particularly advantageous in the analyses of big data, or data with large sample sizes and a large number of variables. It outperforms PS in terms of computational time, data loss, and automatic capture of nonlinear relationships and heterogeneous interventions. It also requires less user specification and choices than PS, reducing potential data dredging. We discuss the performance of our method in the context of such big data and present results for very large simulated samples with many variables. We illustrate the method and the insights it yields in the context of three impact studies with different study designs: reanalysis of a field study on the effect of training on earnings, analysis of the impact of an electronic governance service in India based on a quasi-experiment, and performance comparison of contract pricing mechanisms and durations in IT outsourcing using observational data.
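The abstract does not spell out the algorithm, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It grows a small classification tree that predicts the self-selected intervention from pre-intervention variables, then compares treated and control outcomes within each leaf. The toy data, the column names (`senior`, `region`, `treated`, `y`), the Gini criterion, and the parameters `min_leaf`, `min_gain`, and `max_depth` are all assumptions made for illustration.

```python
from statistics import mean

def gini(rows):
    """Gini impurity of the treatment indicator within a set of rows."""
    if not rows:
        return 0.0
    p = sum(r["treated"] for r in rows) / len(rows)
    return 2 * p * (1 - p)

def best_split(rows, features, min_leaf, min_gain):
    """Return the split that most reduces treatment impurity, or None."""
    best, base = None, gini(rows)
    for f in features:
        values = sorted({r[f] for r in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [r for r in rows if r[f] <= t]
            right = [r for r in rows if r[f] > t]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            gain = base - weighted
            if gain > min_gain and (best is None or gain > best[0]):
                best = (gain, f, t, left, right)
    return best

def grow(rows, features, min_leaf=20, min_gain=1e-3, max_depth=3, path=()):
    """Recursively split on treatment; return leaves as (path, rows) pairs."""
    split = best_split(rows, features, min_leaf, min_gain) if len(path) < max_depth else None
    if split is None:
        return [(path, rows)]
    _, f, t, left, right = split
    return (grow(left, features, min_leaf, min_gain, max_depth, path + (f"{f}<={t}",))
            + grow(right, features, min_leaf, min_gain, max_depth, path + (f"{f}>{t}",)))

def leaf_effects(leaves):
    """Treated-minus-control difference in mean outcome within each leaf."""
    out = []
    for path, rows in leaves:
        treated = [r["y"] for r in rows if r["treated"]]
        control = [r["y"] for r in rows if not r["treated"]]
        effect = mean(treated) - mean(control) if treated and control else None
        out.append((path, len(treated), len(control), effect))
    return out

def make_rows():
    """Hypothetical toy data: seniority drives both self-selection and outcome."""
    rows = []
    for senior, n_treated, n_control in [(1, 80, 20), (0, 20, 80)]:
        for treated, n in [(1, n_treated), (0, n_control)]:
            for i in range(n):
                rows.append({
                    "senior": senior,
                    "region": i % 2,  # unrelated to selection by construction
                    "treated": treated,
                    "y": 5.0 + 5.0 * senior + 2.0 * treated,  # true effect is 2.0
                })
    return rows

rows = make_rows()
# Naive comparison ignores who self-selected into the intervention.
naive = (mean(r["y"] for r in rows if r["treated"])
         - mean(r["y"] for r in rows if not r["treated"]))
effects = leaf_effects(grow(rows, ["senior", "region"]))

print("naive difference:", naive)
for path, n_t, n_c, effect in effects:
    print(path, "treated:", n_t, "control:", n_c, "effect:", effect)
```

In this constructed example the tree splits only on `senior`, the variable that drives selection, and the within-leaf differences recover the built-in effect of 2.0, whereas the naive difference is 5.0. Real data would need checks the sketch omits, such as leaves containing too few treated or control cases.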
