
AUTOMATIC QUERY EXPANSION BASED ON FUZZY THESAURUS FOR INFORMATION RETRIEVAL



Abstract

Query expansion (QE) has proved to be one of the effective methods for improving the performance of information retrieval (IR) systems. In traditional keyword-based search, query expansion appends new associated words or terms to the separate words that represent the user's query. However, introducing such separate words into the search process returns a number of irrelevant documents to the user. This motivates us to combine each modifier with its corresponding headword and to use the combined term in the subsequent search process. Expanded terms in our method are therefore attached not to the separate words/terms but to the combined terms. As a consequence, more relevant documents can be found, and the precision and recall of the IR system increase with the proposed query expansion method. Quite often, expansion is carried out with a thesaurus that contains a group of synonyms for each term. Not all synonyms are equally important for a word/term; they are usually related to each other only to some degree. To specify the degree of relationship between two synonymous words/terms stored in the thesaurus, we consider a fuzzy thesaurus a good choice, in which the grade of relevance for each pair of related words/terms is mapped into the interval [0, 1]. The combined term is then expanded using the developed fuzzy synonym thesaurus. With the proposed approach, we can retrieve relevant documents in a relatively narrow search space while widening the coverage of the original query.
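The expansion step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the thesaurus entries, the grades, the `threshold` cutoff, and the function names `combine` and `expand` are all hypothetical, and the sketch assumes relevance grades lie in [0, 1], as is usual for fuzzy membership values.

```python
from typing import Dict, List, Tuple

# Hypothetical fuzzy synonym thesaurus: term -> {synonym: relevance grade}.
# Grades are assumed to lie in [0, 1]; the entries below are made up.
FuzzyThesaurus = Dict[str, Dict[str, float]]

THESAURUS: FuzzyThesaurus = {
    "mobile phone": {"cell phone": 0.9, "smartphone": 0.7, "handset": 0.4},
}


def combine(modifier: str, headword: str) -> str:
    """Join a modifier and its headword into one combined term."""
    return f"{modifier} {headword}"


def expand(term: str, thesaurus: FuzzyThesaurus,
           threshold: float = 0.5) -> List[Tuple[str, float]]:
    """Return the term plus every synonym whose grade meets the threshold.

    The original term gets grade 1.0. Results are sorted by descending
    grade; the sort is stable, so the original term stays first on ties.
    """
    expanded = [(term, 1.0)]
    for synonym, grade in thesaurus.get(term, {}).items():
        if grade >= threshold:
            expanded.append((synonym, grade))
    return sorted(expanded, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    query_term = combine("mobile", "phone")
    for candidate, grade in expand(query_term, THESAURUS):
        print(f"{candidate}\t{grade:.1f}")
```

Because the lookup is keyed on the combined term rather than on "mobile" and "phone" separately, synonyms of the individual words never enter the expanded query, which is the narrowing effect the abstract describes. The threshold is one possible way to use the grades; weighting expanded terms by grade at ranking time would be another.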
