Journal: Computers in Biology and Medicine

Medical data quality assessment: On the development of an automated framework for medical data curation

Abstract

Data quality assessment has gained attention in recent years, as more and more companies and medical centers highlight the importance of an automated framework for effectively managing the quality of their big data. Data cleaning, also known as data curation, lies at the heart of data quality assessment and is a key step prior to the development of any data analytics service. In this work, we present the objectives, functionalities, and methodological advances of an automated framework for data curation from a medical perspective. The steps towards the development of a system for data quality assessment are first described, along with multidisciplinary data quality measures. A three-layer architecture that realizes these steps is then presented. Emphasis is placed on the detection and tracking of inconsistencies, missing values, outliers, and similarities, as well as on data standardization, to finally enable data harmonization. A case study is conducted to demonstrate the applicability and reliability of the proposed framework on two well-established cohorts with clinical data related to primary Sjögren's Syndrome (pSS). Our results confirm the validity of the proposed framework for the automated and fast identification of outliers, inconsistencies, and highly correlated and duplicated terms, as well as the successful matching of more than 85% of the pSS-related medical terms in both cohorts, yielding more accurate, relevant, and consistent clinical data.
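
To make the kinds of checks named in the abstract more concrete, the following is a minimal sketch of three of them (missing values, outliers, and duplicated terms) on a toy table. It is not the authors' framework: the abstract does not say which detection methods are used, so the interquartile-range rule, the function names, and the column names below are all assumptions chosen for illustration.

```python
from statistics import quantiles


def missing_rate(values):
    """Fraction of entries that are missing (represented here as None)."""
    return sum(v is None for v in values) / len(values)


def iqr_outliers(values, k=1.5):
    """Non-missing values outside [Q1 - k*IQR, Q3 + k*IQR].

    The IQR rule is an assumed stand-in; the abstract does not name
    the outlier detection method.
    """
    present = [v for v in values if v is not None]
    q1, _, q3 = quantiles(present, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in present if v < low or v > high]


def duplicated_columns(table):
    """Pairs of column names whose values are identical row by row."""
    names = list(table)
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if table[a] == table[b]
    ]


# Toy records; every column name and value is invented for illustration.
table = {
    "age": [34, 41, 38, None, 45, 39, 120],
    "age_years": [34, 41, 38, None, 45, 39, 120],
    "visits": [1, 2, 1, 3, 2, 2, 1],
}

print(missing_rate(table["age"]))
print(iqr_outliers(table["age"]))
print(duplicated_columns(table))
```

A real curation pipeline would also need the inconsistency tracking, similarity detection, and terminology matching that the abstract describes, none of which are attempted here.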