From Web Directories to Ontologies: Natural Language Processing Challenges

Abstract

Humans use hierarchical classifications pervasively to organize their data and knowledge about the world. One of their main advantages is that the natural language labels used to describe their contents are easily understood by human users. At the same time, this is also one of their main disadvantages, because these same labels are ambiguous and very hard for software agents to reason about. This creates an insuperable obstacle to embedding classifications in the Semantic Web infrastructure. This paper presents an approach to converting classifications into lightweight ontologies and makes the following contributions: (i) it identifies the main NLP problems related to the conversion process and shows how they differ from the classical problems of NLP; (ii) it proposes heuristic solutions to these problems that are especially effective in this domain; and (iii) it evaluates the proposed solutions by testing them on DMoz data.
