
Linkage-data linear regression


Abstract

Data linkage is increasingly used to combine data from different sources, with the aim of identifying and bringing together records from separate files that correspond to the same entities. Data linkage is usually not a trivial procedure, and linkage errors (false links and missed links) are unavoidable. In these cases, standard statistical techniques may produce misleading inference. In this paper, we propose a method for secondary linear regression analysis, where the linked data have to be prepared by someone else, and neither the match-key variables nor the unlinked records are available to the analyst. We also develop a diagnostic test for the assumption of non-informative linkage errors, which is required by all existing secondary analysis adjustment methods. Our approach provides important advantages: it relies on the realistic assumption that the probabilities of correct linkage vary across records, but it does not assume that one is able to estimate the probability of correct linkage for each individual record. Moreover, it accommodates in a simple manner the general situation where the files are of different sizes and none of them is a subset of another. The proposed methodology of adjustment and testing is studied by simulation and applied to real data.
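
To illustrate why false links can make standard regression misleading, the following is a minimal simulation sketch. It is not the adjustment method proposed in the paper. The function name `simulate_linkage_bias`, the parameter values, and the error mechanism (a random fraction of records has its response swapped with that of other records, independently of the data) are all assumptions chosen for illustration.

```python
import numpy as np


def simulate_linkage_bias(n=5000, beta=2.0, false_link_rate=0.2, seed=0):
    """Compare OLS slopes on correctly linked vs. partly mislinked data.

    Hypothetical setup: x comes from one file and y from another; a random
    fraction of records is linked to the wrong y value.
    """
    rng = np.random.default_rng(seed)

    # True relationship between the two files: y = 1 + beta * x + noise.
    x = rng.normal(size=n)
    y = 1.0 + beta * x + rng.normal(scale=0.5, size=n)

    # Introduce false links: shuffle y among a random subset of records.
    y_linked = y.copy()
    wrong = rng.choice(n, size=int(false_link_rate * n), replace=False)
    y_linked[wrong] = y[rng.permutation(wrong)]

    # np.polyfit with degree 1 returns [slope, intercept].
    slope_clean = np.polyfit(x, y, 1)[0]
    slope_linked = np.polyfit(x, y_linked, 1)[0]
    return slope_clean, slope_linked


if __name__ == "__main__":
    clean, linked = simulate_linkage_bias()
    print(f"slope with correct links: {clean:.3f}")
    print(f"slope with false links:   {linked:.3f}")
```

Under this simplified mechanism, the naive slope is pulled toward zero, to roughly `beta * (1 - false_link_rate)`, because a mislinked response carries no information about its paired `x`. The setting considered in the abstract is harder: the probabilities of correct linkage vary across records and cannot be estimated for each individual record.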
