Complex overlapping concepts: An effective auditing methodology for families of similarly structured BioPortal ontologies

Abstract

In previous research, we demonstrated for a number of ontologies that structurally complex concepts (under different definitions of “complex”) are more likely to exhibit errors than other concepts. Such complex concepts are therefore fertile ground for quality assurance (QA) in ontologies and should be audited first. One example of complex concepts is “overlapping concepts” (defined below). Historically, a different auditing methodology had to be developed for every single ontology. For better scalability and efficiency, it is desirable to identify family-wide QA methodologies, each applicable to a whole family of similar ontologies. In past research, we divided the 685 ontologies of BioPortal into families of structurally similar ontologies. We showed for four ontologies of the same large family in BioPortal that “overlapping concepts” are statistically significantly more likely to exhibit errors. To make an authoritative statement about the success of “overlapping concepts” as a methodology for a whole family of similar ontologies (or of large subhierarchies of ontologies), it is necessary to show that “overlapping concepts” have a higher likelihood of errors for six out of six ontologies of the family. In this paper, we demonstrate for two more ontologies that “overlapping concepts” can successfully predict groups of concepts with a higher error rate than concepts from a control group. The fifth ontology is the Neoplasm subhierarchy of the National Cancer Institute thesaurus (NCIt). The sixth ontology is the Infectious Disease subhierarchy of SNOMED CT. We present quality assurance results for both of them.
Furthermore, in this paper we observe two novel, important, and useful phenomena during quality assurance of “overlapping concepts.” First, an erroneous “overlapping concept” can help with discovering other erroneous “non-overlapping concepts” in its vicinity. Second, correcting erroneous “overlapping concepts” may turn them into “non-overlapping concepts.” We demonstrate that this may reduce the complexity of parts of the ontology, which in turn makes the ontology more comprehensible and simplifies its maintenance and use.
