International Journal of Population Data Science

Methods for enhancing the reproducibility of clinical epidemiology research in linked electronic health records: results and lessons learned from the CALIBER platform

Abstract

Objectives: Electronic health records (EHR) across primary, secondary, and tertiary care are increasingly being linked for research at a population level. The increasing volume, variety, velocity, and veracity of big biomedical data make research reproducibility challenging. Research reproducibility and replicability are essential for the external validity and generalizability of scientific findings, and the lack of standardized approaches and tools, together with the relative opaqueness of data manipulation methods, is detrimental to their integrity. The objective of this study was to explore, evaluate, and propose methods, tools, and approaches for addressing some of the challenges associated with reproducibility when using linked national electronic health records for research.

Approach: We systematically searched the literature and internet resources for well-established and appropriate methods, tools, and approaches used in related scientific disciplines. The identified techniques were systematically evaluated in terms of their capacity to facilitate reproducible research in routinely collected health data across the life course of a research project: from protocol creation and raw data curation, to data transformation and statistical analysis, through to dissemination of findings and impact. Most importantly, the identified techniques were tested and applied in a contemporary database of linked electronic health records. CALIBER is a research data platform of linked national electronic health records from primary care (Clinical Practice Research Datalink), secondary care (Hospital Episode Statistics), an acute coronary syndrome disease registry (Myocardial Ischaemia National Audit Project), and cause-specific mortality (Office for National Statistics) for roughly 2 million adults.

Results: Firstly, we present a review of the methods and approaches identified through our search. Secondly, we propose a set of recommendations for applying them within the context of research projects making use of linked routinely collected health data. Focal interests included: a) documentation of data (attributes, relationships, and interpretation), b) data processing (source code, instructions, and parameters), c) results (visualizations, figures), and any supplementary material. Thirdly, we present approaches around a) raw data curation using international metadata standards, b) study protocol encoding, c) provenance and sharing of data transformation and statistical analysis operations, d) public and private data retention, and e) computable EHR-driven phenotypes.

Conclusion: The complexity and size of routinely collected health data are increasing through linkages across distributed data sources. The scientific community benefits from findings which can be replicated. This study presents a number of methods, tools, and approaches across the project life course that researchers can use to ensure that their studies are reproducible and replicable by the wider scientific community.
