Journal: Journal of Biomedical Informatics

Reuse of termino-ontological resources and text corpora for building a multilingual domain ontology: An application to Alzheimer's disease

Abstract

Ontologies are useful tools for sharing and exchanging knowledge. However, ontology construction is complex and often time-consuming. In this paper, we present a method for building a bilingual domain ontology from textual and termino-ontological resources, intended for semantic annotation and information retrieval of textual documents. This method combines two approaches: ontology learning from texts and the reuse of existing terminological resources. It consists of four steps: (i) term extraction from domain-specific corpora (in French and English) using textual analysis tools; (ii) clustering of terms into concepts organized according to the UMLS Metathesaurus; (iii) ontology enrichment through the alignment of French and English terms using parallel corpora and the integration of new concepts; and (iv) refinement and validation of results by domain experts. These validated results are formalized into a domain ontology dedicated to Alzheimer's disease and related syndromes, which is available online (http://lesi-m.isped.u-bordeaux2.fr/SemBiP/ressources/ontoAD.owl). The ontology currently includes 5765 concepts linked by 7499 taxonomic relationships and 10,889 non-taxonomic relationships. Of these, 439 concepts absent from the UMLS were created and 608 new synonymous French terms were added. The proposed method is sufficiently flexible to be applied to other domains.
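
Steps (ii) and (iii) above can be illustrated with a toy sketch. This is not the authors' implementation: the lookup table stands in for a UMLS Metathesaurus query, the concept identifiers are invented placeholders rather than real UMLS CUIs, and the term pairs are hypothetical examples of what alignment over a parallel corpus might produce. All function and variable names are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical stand-in for a UMLS Metathesaurus lookup: term -> concept ID.
# The IDs are invented placeholders, not real UMLS CUIs.
TERM_TO_CONCEPT = {
    "alzheimer's disease": "C_AD",
    "alzheimer disease": "C_AD",
    "memory loss": "C_MEM",
}

# Hypothetical English-French term pairs, as might come from a parallel corpus.
ALIGNED_PAIRS = [
    ("alzheimer's disease", "maladie d'alzheimer"),
    ("memory loss", "perte de mémoire"),
    ("caregiver burden", "fardeau de l'aidant"),
]


def cluster_terms(terms, lookup):
    """Step (ii): group extracted terms under the concept they map to."""
    concepts = defaultdict(lambda: {"en": set(), "fr": set()})
    unmapped = []
    for term in terms:
        key = term.lower()
        concept_id = lookup.get(key)
        if concept_id is None:
            # No known concept: a candidate new concept for expert review.
            unmapped.append(key)
        else:
            concepts[concept_id]["en"].add(key)
    return concepts, unmapped


def enrich_with_french(concepts, lookup, pairs):
    """Step (iii): attach each French term to its English counterpart's concept."""
    candidates = []
    for en, fr in pairs:
        concept_id = lookup.get(en.lower())
        if concept_id in concepts:
            concepts[concept_id]["fr"].add(fr)
        else:
            # English side has no concept yet: keep the pair for validation.
            candidates.append((en, fr))
    return candidates


extracted = ["Alzheimer's disease", "Alzheimer disease", "memory loss", "caregiver burden"]
concepts, unmapped = cluster_terms(extracted, TERM_TO_CONCEPT)
candidates = enrich_with_french(concepts, TERM_TO_CONCEPT, ALIGNED_PAIRS)
```

In this sketch, terms with no matching concept and aligned pairs with no concept on the English side are set aside rather than discarded, mirroring the abstract's point that new concepts are integrated only after refinement and validation by domain experts in step (iv).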
