Journal: Data & Knowledge Engineering

ATOLL - A framework for the automatic induction of ontology lexica


Abstract

A range of large knowledge bases, such as Freebase and DBpedia, as well as linked data sets, are available on the web, but they typically lack lexical information stating how the properties and classes they comprise are realized lexically. Often at most one label is attached, so rich linguistic information is missing, e.g. about morphological forms, syntactic arguments, or possible lexical variants and paraphrases. While ontology lexicon models like lemon allow such linguistic information to be defined with respect to a given ontology, the cost of creating and maintaining such lexica is substantial, requiring considerable manual effort. To lower this effort, we present ATOLL, a framework for the automatic induction of ontology lexica, based both on existing labels and on dependency paths extracted from a text corpus. We instantiate ATOLL with DBpedia as the dataset and Wikipedia as the corresponding corpus, and evaluate it by comparing the automatically generated lexicon with a manually constructed one. Our results indicate that the approach has high potential for semi-automatic use, in which a lexicon engineer validates, rejects, or refines the automatically generated lexical entries, and can thereby contribute to reducing the overall cost of creating ontology lexica.
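
To make the label-based half of this idea concrete, the following is a minimal sketch of how a candidate lexical entry might be derived from an existing ontology label. It is not ATOLL's implementation: the `CandidateEntry` type, the `split_label` and `candidates_from_label` helpers, and the camel-case splitting rule are all assumptions made for illustration, and the dependency-path part of the approach is not shown.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class CandidateEntry:
    """Hypothetical record for a lexical entry awaiting human review."""
    written_form: str  # proposed lexicalization, e.g. "birth place"
    reference: str     # URI of the ontology property or class
    source: str        # where the candidate came from ("label" here)


def split_label(label: str) -> str:
    """Turn an identifier-style label such as 'birthPlace' into 'birth place'."""
    # Insert a space at every lowercase/digit-to-uppercase boundary.
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", label)
    # Treat underscores as word separators as well.
    spaced = spaced.replace("_", " ")
    # Lowercase and collapse any repeated whitespace.
    return " ".join(spaced.lower().split())


def candidates_from_label(uri: str, label: str) -> List[CandidateEntry]:
    """Propose one candidate entry from the label attached to a resource."""
    return [CandidateEntry(split_label(label), uri, "label")]


if __name__ == "__main__":
    for entry in candidates_from_label(
        "http://dbpedia.org/ontology/birthPlace", "birthPlace"
    ):
        print(entry)
```

In a semi-automatic workflow of the kind the abstract describes, such candidates would be handed to a lexicon engineer to validate, reject, or refine rather than being accepted directly.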
