SIFR annotator: ontology-based semantic annotation of French biomedical text and clinical notes

Abstract

Background: Despite the wide adoption of English in science, a significant amount of biomedical data is produced in other languages, such as French. Yet the majority of natural language processing and semantic tools, as well as domain terminologies and ontologies, are available only in English and cannot be readily applied to other languages because of fundamental linguistic differences. However, semantic resources are required to design semantic indexes and to transform biomedical (text) data into knowledge for better information mining and retrieval.

Results: We present the SIFR Annotator (http://bioportal.lirmm.fr/annotator), a publicly accessible ontology-based annotation web service to process biomedical text data in French. The service, developed during the Semantic Indexing of French Biomedical Data Resources (2013–2019) project, is included in the SIFR BioPortal, an open platform to host French biomedical ontologies and terminologies based on the technology developed by the US National Center for Biomedical Ontology. The portal facilitates the use and fostering of ontologies by offering a set of services (search, mappings, metadata, versioning, visualization, recommendation), including for annotation purposes. We introduce the adaptations and improvements made in applying the technology to French, as well as a number of language-independent additional features implemented by means of a proxy architecture, in particular annotation scoring and clinical context detection. We evaluate the performance of the SIFR Annotator on different biomedical data, using available French corpora: Quaero (titles from French MEDLINE abstracts and EMEA drug labels) and CépiDC (ICD-10 coding of death certificates), and discuss our results with respect to the CLEF eHealth information extraction tasks.

Conclusions: We show that the web service performs comparably to other knowledge-based annotation approaches in recognizing entities in biomedical text and reaches state-of-the-art levels in clinical context detection (negation, experiencer, temporality). Additionally, the SIFR Annotator is the first openly web-accessible tool to annotate and contextualize French biomedical text with ontology concepts, leveraging a dictionary currently made of 28 terminologies and ontologies and 333 K concepts. The code is openly available, and we also provide a Docker packaging for easy local deployment to process sensitive (e.g., clinical) data in-house (https://github.com/sifrproject).
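
To make the idea of dictionary-based annotation with negation detection concrete, here is a minimal sketch in Python. It is not the SIFR Annotator's implementation and does not call its web service: the dictionary entries, the `EX:` concept identifiers, the trigger list, the `annotate` function, and the fixed character window are all hypothetical choices made for illustration only.

```python
import re
from dataclasses import dataclass

# Hypothetical toy dictionary: French term -> placeholder concept identifier.
# These identifiers are invented for this sketch, not real ontology URIs.
DICTIONARY = {
    "fièvre": "EX:0001",
    "infarctus du myocarde": "EX:0002",
    "diabète": "EX:0003",
}

# Hypothetical negation triggers (an assumption of this sketch).
NEGATION = re.compile(r"(?<!\w)(?:pas de\b|pas d'|sans\b|aucune?\b)", re.IGNORECASE)


@dataclass
class Annotation:
    term: str       # dictionary term that matched
    concept: str    # placeholder concept identifier
    start: int      # character offset of the match start
    end: int        # character offset just past the match
    negated: bool   # True if a trigger occurs shortly before the match


def annotate(text, window=20):
    """Find dictionary terms in text and flag those preceded by a negation trigger.

    A match is marked as negated when a trigger appears within `window`
    characters before it. This is a crude stand-in for real context detection.
    """
    found = []
    for term, concept in DICTIONARY.items():
        # Match the term only as a whole word or phrase, ignoring case.
        pattern = r"(?<!\w)" + re.escape(term) + r"(?!\w)"
        for m in re.finditer(pattern, text, re.IGNORECASE):
            context = text[max(0, m.start() - window):m.start()]
            negated = NEGATION.search(context) is not None
            found.append(Annotation(term, concept, m.start(), m.end(), negated))
    return sorted(found, key=lambda a: a.start)


if __name__ == "__main__":
    for a in annotate("Patient sans fièvre, antécédent de diabète."):
        print(a)
```

A fixed character window keeps the sketch short, but it ignores sentence boundaries and trigger scope, and it covers neither experiencer nor temporality, which the abstract lists alongside negation.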
