Journal: The Journal of Systems and Software

A novel semantic information retrieval system based on a three-level domain model

Abstract

This paper presents a methodology and a prototype for extracting and indexing knowledge from natural language documents. The underlying domain model relies on a conceptual level (described by means of a domain ontology), which represents the domain knowledge, and a lexical level (based on WordNet), which represents the domain vocabulary. A stochastic model (the ME-2L-HMM2, which mixes - in a novel way - HMM and maximum entropy models) stores the mapping between such levels, taking into account the linguistic context of words. Not only does such a context contain the surrounding words; it also contains morphologic and syntactic information extracted using natural language processing tools. The stochastic model is then used, during the document indexing phase, to disambiguate word meanings. The semantic information retrieval engine we developed supports simple keyword-based queries, as well as natural language-based queries. The engine is also able to extend the domain knowledge, discovering new and relevant concepts to add to the domain model. The validation tests indicate that the system is able to disambiguate and extract concepts with good accuracy. A comparison between our prototype and a classic search engine shows that the proposed approach is effective in providing better accuracy.
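To make the disambiguation step more concrete, here is a minimal sketch of how a hidden Markov model can pick a concept for each word from its neighbours. It is an assumption-laden illustration, not the paper's ME-2L-HMM2: it uses a plain first-order HMM with Viterbi decoding, no maximum entropy component, and no morphologic or syntactic features. The concept names (`finance`, `river`), the toy vocabulary, and all probabilities are hypothetical.

```python
import math

# Hypothetical toy model: hidden states are concepts, observations are words.
STATES = ["finance", "river"]
START_P = {"finance": 0.5, "river": 0.5}
TRANS_P = {
    "finance": {"finance": 0.8, "river": 0.2},
    "river": {"finance": 0.2, "river": 0.8},
}
EMIT_P = {
    "finance": {"bank": 0.5, "loan": 0.5},
    "river": {"bank": 0.5, "water": 0.5},
}


def viterbi(words, states, start_p, trans_p, emit_p, floor=1e-9):
    """Return the most likely concept sequence for the observed words."""
    if not words:
        return []

    def lp(p):
        # Log-probability with a small floor so unseen events are not -inf.
        return math.log(max(p, floor))

    # best[i][s] = (log-prob of the best path ending in s at position i, previous state)
    best = [{
        s: (lp(start_p.get(s, 0.0)) + lp(emit_p[s].get(words[0], 0.0)), None)
        for s in states
    }]
    for word in words[1:]:
        column = {}
        for s in states:
            prev, score = max(
                ((p, best[-1][p][0] + lp(trans_p[p].get(s, 0.0))) for p in states),
                key=lambda item: item[1],
            )
            column[s] = (score + lp(emit_p[s].get(word, 0.0)), prev)
        best.append(column)

    # Backtrack from the best final state to recover the full path.
    state = max(states, key=lambda s: best[-1][s][0])
    path = [state]
    for column in reversed(best[1:]):
        state = column[state][1]
        path.append(state)
    path.reverse()
    return path
```

With these toy numbers, `viterbi(["loan", "bank"], STATES, START_P, TRANS_P, EMIT_P)` tags the ambiguous word "bank" as `finance`, while `["water", "bank"]` tags it as `river`, because the neighbouring word shifts the most likely path.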
