Journal: Bioinformatics

Extending ontologies by finding siblings using set expansion techniques

Abstract

Motivation: Ontologies are an everyday tool in biomedicine for capturing and representing knowledge. However, many ontologies lack a high degree of coverage in their domain and need to improve in overall quality and maturity. Automatically extending sets of existing terms will enable ontology engineers to systematically improve text-based ontologies level by level.

Results: We developed an approach to extend ontologies by discovering new terms that are in a sibling relationship to existing terms of an ontology. For this purpose, we combined two approaches that retrieve new terms from the web. The first extracts siblings by exploiting the structure of HTML documents, whereas the second uses text mining techniques to extract siblings from unstructured text. Our evaluation against MeSH (Medical Subject Headings) shows that our method for sibling discovery is able to suggest first-class ontology terms and can be used as an initial step towards assessing the completeness of ontologies. The evaluation yields a recall of 80% at a precision of 61%, with the two independent approaches complementing each other. For MeSH in particular, we show that it can be considered complete in its medical focus area. We integrated the work into DOG4DAG, an ontology generation plugin for the editors OBO-Edit and Protege, making it the first plugin that supports sibling discovery on-the-fly.
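The abstract does not describe how siblings are extracted from HTML structure, so the following is only a toy sketch of the general idea, not the authors' method. It assumes that items sharing an HTML list with at least two known seed terms are candidate siblings. The names `ListItemCollector` and `candidate_siblings`, the `min_seed_hits` threshold, and the sample page are all hypothetical.

```python
from html.parser import HTMLParser


class ListItemCollector(HTMLParser):
    """Collect the text of <li> items, grouped by their enclosing <ul>/<ol>."""

    def __init__(self):
        super().__init__()
        self.lists = []      # finished lists, each a list of item strings
        self._stack = []     # lists that are currently open (handles nesting)
        self._buffer = None  # text pieces of the <li> being read, or None

    def _flush_item(self):
        # Store the buffered item text in the innermost open list.
        if self._buffer is not None and self._stack:
            text = " ".join("".join(self._buffer).split())
            if text:
                self._stack[-1].append(text)
        self._buffer = None

    def handle_starttag(self, tag, attrs):
        if tag in ("ul", "ol"):
            self._flush_item()       # parent item text ends where a nested list begins
            self._stack.append([])
        elif tag == "li" and self._stack:
            self._flush_item()       # tolerate a missing </li>
            self._buffer = []

    def handle_endtag(self, tag):
        if tag == "li":
            self._flush_item()
        elif tag in ("ul", "ol") and self._stack:
            self._flush_item()
            self.lists.append(self._stack.pop())

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)


def candidate_siblings(html, seeds, min_seed_hits=2):
    """Return list items that co-occur with at least `min_seed_hits` seed terms.

    Assumption for illustration: co-occurrence in one HTML list suggests a
    sibling relationship. Matching is case-insensitive and exact.
    """
    parser = ListItemCollector()
    parser.feed(html)
    parser.close()
    seed_keys = {s.lower() for s in seeds}
    found = {}
    for items in parser.lists:
        keys = {item.lower() for item in items}
        if len(keys & seed_keys) >= min_seed_hits:
            for item in items:
                if item.lower() not in seed_keys:
                    found.setdefault(item.lower(), item)
    return sorted(found.values())


# Made-up sample page: a navigation list and a list of organs.
page = """
<ul><li>Home</li><li>Contact</li></ul>
<ul><li>Liver</li><li>Kidney</li><li>Spleen</li><li>Pancreas</li></ul>
"""
print(candidate_siblings(page, ["liver", "kidney"]))  # ['Pancreas', 'Spleen']
```

Requiring two seed hits per list is a simple guard against navigation menus and other unrelated lists. A real system would still need ranking and filtering of the candidates, which this sketch leaves out.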