
Extending ontologies by finding siblings using set expansion techniques


Abstract

Motivation: Ontologies are an everyday tool in biomedicine to capture and represent knowledge. However, many ontologies lack a high degree of coverage in their domain and need to improve their overall quality and maturity. Automatically extending sets of existing terms will enable ontology engineers to systematically improve text-based ontologies level by level.

Results: We developed an approach to extend ontologies by discovering new terms which are in a sibling relationship to existing terms of an ontology. For this purpose, we combined two approaches which retrieve new terms from the web. The first approach extracts siblings by exploiting the structure of HTML documents, whereas the second approach uses text mining techniques to extract siblings from unstructured text. Our evaluation against MeSH (Medical Subject Headings) shows that our method for sibling discovery is able to suggest first-class ontology terms and can be used as an initial step towards assessing the completeness of ontologies. The evaluation yields a recall of 80% at a precision of 61%, with the two independent approaches complementing each other. For MeSH in particular, we show that it can be considered complete in its medical focus area. We integrated the work into DOG4DAG, an ontology generation plugin for the editors OBO-Edit and Protégé, making it the first plugin that supports sibling discovery on-the-fly.

Availability: Sibling discovery for ontologies is available as part of DOG4DAG for both Protégé 4.1 and OBO-Edit 2.1.

Supplementary information: Supplementary data are available at Bioinformatics online.
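The idea of extracting siblings from HTML structure can be illustrated with a small sketch. This is not the authors' implementation, whose details the abstract does not give. It assumes, purely for illustration, that terms appearing in the same HTML list as at least two known seed terms are sibling candidates. The names `ListCollector`, `sibling_candidates`, `min_seeds` and the sample page are all hypothetical.

```python
from html.parser import HTMLParser


class ListCollector(HTMLParser):
    """Collect the text of <li> items, grouped by their enclosing <ul>/<ol>."""

    def __init__(self):
        super().__init__()
        self.lists = []    # finished lists, each a list of item strings
        self._stack = []   # currently open lists (supports nesting)
        self._item = None  # text chunks of the <li> being read, if any

    def handle_starttag(self, tag, attrs):
        if tag in ("ul", "ol"):
            self._flush_item()          # close any item text before nesting
            self._stack.append([])
        elif tag == "li" and self._stack:
            self._flush_item()          # tolerate a missing </li>
            self._item = []

    def handle_endtag(self, tag):
        if tag == "li":
            self._flush_item()
        elif tag in ("ul", "ol") and self._stack:
            self._flush_item()
            self.lists.append(self._stack.pop())

    def handle_data(self, data):
        if self._item is not None:
            self._item.append(data)

    def _flush_item(self):
        if self._item is not None and self._stack:
            text = " ".join("".join(self._item).split())
            if text:
                self._stack[-1].append(text)
        self._item = None


def sibling_candidates(html, seeds, min_seeds=2):
    """Return non-seed items from lists containing at least min_seeds seeds.

    Assumption (illustrative only): sharing a list with several seed terms
    is treated as evidence of a sibling relationship.
    """
    parser = ListCollector()
    parser.feed(html)
    parser.close()
    seeds_lower = {s.lower() for s in seeds}
    counts = {}
    for items in parser.lists:
        lowered = {i.lower() for i in items}
        if len(lowered & seeds_lower) >= min_seeds:
            for item in items:
                if item.lower() not in seeds_lower:
                    counts[item] = counts.get(item, 0) + 1
    # Most frequently co-listed candidates first, ties broken alphabetically.
    return sorted(counts, key=lambda t: (-counts[t], t))


# Hypothetical page: one list of organs, one navigation list.
PAGE = """
<ul><li>Liver</li><li>Kidney</li><li>Spleen</li><li>Pancreas</li></ul>
<ol><li>Home</li><li>Contact</li></ol>
"""

if __name__ == "__main__":
    print(sibling_candidates(PAGE, ["liver", "kidney"]))
    # prints ['Pancreas', 'Spleen']
```

Requiring several seeds per list is a simple guard against navigation menus and other unrelated lists. A real system would also need to fetch pages, handle tables and other markup, and rank candidates across many documents.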
