International Journal of Population Data Science

Assessing the impact of different grouping methods: time to rethink and regroup?

Abstract

Objectives: The grouping of record-pairs to determine which administrative records belong to the same individual is an important process in record linkage. A variety of grouping methods are used, but the relative benefits of each are unknown. We evaluate a number of grouping methods against the traditional merge-based clustering approach using large-scale administrative data.

Approach: The research aimed both to describe current grouping techniques used for record linkage and to evaluate the most appropriate grouping method for specific circumstances. A range of grouping strategies was applied to three datasets with known truth sets. Conditions were simulated to investigate one-to-one, many-to-one and ongoing linkage scenarios.

Results: Results suggest that alternative grouping methods can yield large benefits in linkage quality, especially when the quality of the underlying repository is high. Stepwise grouping methods were clearly superior for one-to-one linkage. There appeared to be little difference in linkage quality between many-to-one grouping approaches. The most appropriate techniques for ongoing linkage depended on the quality of the population spine and the underlying dataset.

Conclusions: These results demonstrate the large effect that the choice of grouping strategy can have on overall linkage quality. Ongoing linkages to high-quality population spines provide large improvements in linkage quality compared to merge-based linkages. Procuring or developing such a population spine will provide high linkage quality at far lower cost than current methods for improving linkage quality. Improving linkage quality at low cost allows this resource to be further utilised by health researchers.
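To make the contrast between grouping strategies concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that merge-based grouping means taking the transitive closure of matched record-pairs, and it uses a generic greedy best-score assignment as a stand-in for a one-to-one strategy; the abstract does not specify the stepwise methods that were evaluated. The function names `merge_based_groups` and `one_to_one_links`, the record identifiers and the scores are all hypothetical.

```python
from collections import defaultdict


def merge_based_groups(pairs):
    """Group record IDs by transitive closure of matched pairs (union-find).

    Assumption: if A matches B and B matches C, all three are merged into
    one group, even when A and C were never compared directly.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = defaultdict(set)
    for record in list(parent):
        groups[find(record)].add(record)
    return sorted(sorted(group) for group in groups.values())


def one_to_one_links(scored_pairs):
    """Greedy one-to-one assignment (illustrative, not the paper's method).

    Pairs are accepted from highest to lowest score, and each record on
    either side may be used at most once.
    """
    used_a, used_b, links = set(), set(), []
    for a, b, _score in sorted(scored_pairs, key=lambda p: -p[2]):
        if a not in used_a and b not in used_b:
            links.append((a, b))
            used_a.add(a)
            used_b.add(b)
    return links


if __name__ == "__main__":
    # Hypothetical matched pairs: r1-r2 and r2-r3 chain into one group.
    print(merge_based_groups([("r1", "r2"), ("r2", "r3"), ("r4", "r5")]))
    # Hypothetical scored candidate pairs between two files.
    print(one_to_one_links([("a1", "b1", 0.9), ("a1", "b2", 0.8),
                            ("a2", "b1", 0.7), ("a2", "b2", 0.6)]))
```

Under these assumptions, a single false-positive pair in merge-based grouping joins two otherwise separate groups, whereas a one-to-one constraint limits each record to a single link. That difference is one way the choice of grouping strategy can affect linkage quality.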