An Information Retrieval Model Based On Word Concept

Abstract

Traditional approaches to information retrieval from text depend on term frequency. Because these schemes consider only the occurrences of terms in a document, they are limited in their ability to extract indexes that accurately represent the document's semantic content. Moreover, a single word can represent more than one meaning, and such word-sense ambiguity also affects system behavior. To address these issues, we propose a concept extraction strategy that extracts the concepts of words and determines the semantic importance of those concepts within sentences by analyzing the sentences' conceptual structures. In this approach, a conceptual vector space model using auto-threshold detection is proposed to process the concepts, and a cluster searching model is also designed. The auto-threshold detection method helps the model obtain the optimal settings of the retrieval parameters automatically. An experiment on the TREC6 collection shows that the proposed method outperforms two other information retrieval (IR) methods based on term frequency (TF), especially for the lower-ranked documents.
