Journal of Biomedical Semantics

Semantic enrichment of longitudinal clinical study data using the CDISC standards and the semantic statistics vocabularies



Abstract

Background: There is increasing recognition that the data capture phase of clinical studies needs to be improved and that clinical data needs to be shared more effectively. The Health Care and Life Sciences community has embraced semantic technologies to facilitate the integration of health data from electronic health records, clinical studies and pharmaceutical research. This paper explores the integration of clinical study data exchange standards and semantic statistics vocabularies to deliver clinical data as linked data, in a format that is easier to enrich with links to complementary data sources and easier for a broad user base to consume.

Methods: We propose a Linked Clinical Data Cube (LCDC), which combines the strengths of the RDF Data Cube and DDI-RDF vocabularies to enrich clinical data based on the CDISC standards. The CDISC standards provide the mechanisms for the data to be standardised and made more accessible and accountable, whereas the RDF Data Cube and DDI-RDF vocabularies provide novel approaches to managing large volumes of heterogeneous linked data resources.

Results: We validate our approach using a large-scale longitudinal clinical study of neurodegenerative diseases. The dataset, comprising more than 1600 variables clustered in 25 different sub-domains, has been fully converted into RDF, forming one main data cube and one specialised cube for each sub-domain. One sub-domain, the Medications specialised cube, has been linked to relevant external vocabularies, such as the Australian Medicines Terminology, the ATC DDD taxonomy and the DrugBank terminology. This provides new dimensions on which to query the data, which promote the exploration of drug-drug and drug-disease interactions.

Conclusions: This implementation highlights the effectiveness of combining the semantic statistics vocabularies for publishing large heterogeneous data sets as linked data, and of integrating those vocabularies with the CDISC standards. In particular, it demonstrates the potential of the two vocabularies to overcome the monolithic nature of the underlying model and to improve navigation and querying of the data from multiple angles, supporting richer analysis of clinical study data. The forecast benefits are more efficient use of clinicians' time and the potential to facilitate cross-study analysis.
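To make the cube idea concrete, the following is a minimal sketch of how one medication record might be expressed as an RDF Data Cube observation and then filtered along a dimension. It is not the authors' implementation. The `qb:Observation` class and `qb:dataSet` property come from the W3C RDF Data Cube vocabulary; everything else (the `http://example.org/lcdc/` namespace, the property names `subject`, `visit`, `atcCode`, `dailyDoseMg`, and the sample rows) is hypothetical and chosen only for illustration. Triples are kept as plain Python tuples so that the example needs no third-party library.

```python
# RDF Data Cube and RDF namespaces (real vocabularies).
QB = "http://purl.org/linked-data/cube#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
# Hypothetical namespace for this sketch; not taken from the paper.
EX = "http://example.org/lcdc/"


def observation_triples(obs_id, dataset, dimensions, measures):
    """Return (subject, predicate, object) triples for one qb:Observation.

    Dimension and measure values are kept as plain Python values; a real
    serialisation would distinguish IRIs from typed literals.
    """
    subject = EX + "obs/" + obs_id
    triples = [
        (subject, RDF_TYPE, QB + "Observation"),
        (subject, QB + "dataSet", EX + "dataset/" + dataset),
    ]
    for name, value in {**dimensions, **measures}.items():
        triples.append((subject, EX + "prop/" + name, value))
    return triples


def slice_by(triples, prop, value):
    """Return the sorted observation IRIs whose property `prop` equals `value`."""
    predicate = EX + "prop/" + prop
    return sorted(s for s, p, o in triples if p == predicate and o == value)


# Illustrative rows only; these are not data from the study.
rows = [
    ("o1", {"subject": "S001", "visit": "baseline", "atcCode": "N06DA02"}, {"dailyDoseMg": 5}),
    ("o2", {"subject": "S001", "visit": "month18", "atcCode": "N06DA02"}, {"dailyDoseMg": 10}),
    ("o3", {"subject": "S002", "visit": "baseline", "atcCode": "C10AA05"}, {"dailyDoseMg": 20}),
]

triples = []
for obs_id, dims, meas in rows:
    triples.extend(observation_triples(obs_id, "medications", dims, meas))

# Query along the (hypothetical) ATC code dimension.
matching_obs = slice_by(triples, "atcCode", "N06DA02")
print(matching_obs)
```

Because each observation carries its dimensions explicitly, linking a dimension value such as an ATC code to an external vocabulary adds a new axis for queries without changing the observations themselves. In practice a triple store and SPARQL would replace the list filtering shown here.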
