Methods for analyzing data from probabilistic linkage strategies based on partially identifying variables

Hof M.H.P.; Zwinderman A.H.

Abstract

In record linkage studies, unique identifiers are often not available, and therefore the linkage procedure depends on combinations of partially identifying variables with low discriminating power. As a consequence, wrongly linked covariate and outcome pairs will be created, and these bias further analysis of the linked data. In this article, we investigated two estimators that correct for linkage error in regression analysis. We extended the estimators developed by Lahiri and Larsen and also suggested a weighted least squares approach to deal with linkage error. We considered both linear and logistic regression problems and evaluated the performance of both methods with simulations. Our results show that all wrong covariate and outcome pairs need to be removed from the analysis in order to calculate unbiased regression coefficients in both approaches. This removal requires strong assumptions about the structure of the data. In addition, the bias increases significantly when the assumptions do not hold and wrongly linked records influence the coefficient estimation. Our simulations showed that both methods had similar performance in linear regression problems. With logistic regression problems, the weighted least squares method showed less bias. Because the specific structure of the data in record linkage problems often leads to different assumptions, it is necessary that the analyst has prior knowledge of the nature of the data. These assumptions are more easily introduced in the weighted least squares approach than in the Lahiri and Larsen estimator.
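
The effect described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' estimator or the Lahiri and Larsen estimator; it only shows, under stated assumptions, how wrongly linked pairs attenuate a linear regression slope and how giving them zero weight in a weighted least squares fit removes that bias. The function names `wls_fit` and `simulate_linked_data`, the true coefficients (intercept 1, slope 2), and the 30% wrong-link fraction are all hypothetical choices made for the example. It also assumes the true link status of every pair is known, which is exactly the strong assumption the abstract warns about.

```python
import numpy as np


def wls_fit(x, y, w):
    """Weighted least squares for y = b0 + b1 * x; returns array [b0, b1]."""
    X = np.column_stack([np.ones_like(x), x])
    sw = np.sqrt(w)
    # Scaling rows by sqrt(w) turns WLS into an ordinary least squares problem.
    coef, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return coef


def simulate_linked_data(n=20000, wrong_frac=0.3, seed=0):
    """Simulate linked (x, y) pairs where a fraction of the links is wrong.

    Hypothetical setup: y = 1 + 2 * x + noise for correctly linked pairs.
    Wrong links are created by shuffling outcomes among the flagged records.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    y_true = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=n)
    wrong = rng.random(n) < wrong_frac
    y_linked = y_true.copy()
    idx = np.flatnonzero(wrong)
    # A shuffled record can occasionally map back to itself; this is rare
    # enough to ignore for the purpose of the illustration.
    y_linked[idx] = y_true[rng.permutation(idx)]
    return x, y_linked, wrong


x, y, wrong = simulate_linked_data()

# Naive fit: every linked pair gets weight 1, wrong links included.
naive = wls_fit(x, y, np.ones_like(x))

# Oracle fit: wrong links get weight 0 (assumes link status is known).
oracle = wls_fit(x, y, (~wrong).astype(float))

print("naive slope: ", naive[1])
print("oracle slope:", oracle[1])
```

In this setup the outcome of a wrongly linked record is independent of its covariate, so the naive slope is pulled toward zero by roughly the wrong-link fraction (about 2 × 0.7 = 1.4 in expectation), while the zero-weight fit recovers a slope close to 2. In real linkage data the link status is unknown, so any weights would have to come from estimated match probabilities and from assumptions about the data structure.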


Bibliographic details

Source
Statistics in Medicine | 2012, Issue 30 | 12 pages
Authors
Hof M.H.P.; Zwinderman A.H.
Original format: PDF
Language: English
Chinese Library Classification: Health surveys and statistics
Keywords
Matching error; Record linkage; Regression analysis
