AUTOMATIC QUERY EXPANSION BASED ON FUZZY THESAURUS FOR INFORMATION RETRIEVAL

Abstract

Query expansion (QE) has proven to be an effective method for improving the performance of information retrieval (IR) systems. In traditional keyword-based search, query expansion appends new associated words or terms to the separate words that represent the user's query. However, introducing such separate words into the search process returns a number of irrelevant documents to the user. This motivates us to combine each modifier with its corresponding headword and to use the combined term in the subsequent search process. Expanded terms in our method are therefore attached not to the separate words/terms but to the combined terms. As a consequence, more relevant documents can be found, and the precision and recall of the IR system can be increased by the proposed query expansion method. Quite often, the expansion process is conducted with a thesaurus that contains a group of synonyms for each term. Not all synonyms are equally important for a word/term, however; they are usually related to each other only to some degree. To specify the degree of relationship between two synonymous words/terms stored in the thesaurus, we propose constructing a fuzzy thesaurus, in which the grade of relevance for each pair of related words/terms is mapped into the interval [0, 1]. The combined term is then expanded using the developed fuzzy synonym thesaurus. Through the proposed approaches, we can retrieve relevant documents in a relatively narrow search space while widening the coverage of the original query.
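
The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the thesaurus entries, the relevance grades, the 0.5 cut-off, and the function names `combine` and `expand` are all hypothetical, and the sketch assumes that the fuzzy thesaurus is keyed directly by combined terms and that grades lie in [0, 1].

```python
from typing import Dict, List, Tuple

# A fuzzy thesaurus maps a term to its synonyms, each with a grade of
# relevance. Grades are assumed to lie in [0, 1].
FuzzyThesaurus = Dict[str, Dict[str, float]]

# Hypothetical entries; the grades are made up for illustration only.
THESAURUS: FuzzyThesaurus = {
    "information retrieval": {
        "document retrieval": 0.9,
        "text search": 0.7,
        "data lookup": 0.3,
    },
}


def combine(modifier: str, headword: str) -> str:
    """Join a modifier and its headword into one combined term."""
    return f"{modifier} {headword}".lower()


def expand(
    term: str, thesaurus: FuzzyThesaurus, threshold: float = 0.5
) -> List[Tuple[str, float]]:
    """Return the term plus its synonyms whose grade meets the threshold.

    The original term is given grade 1.0; synonyms follow in order of
    decreasing grade. The threshold is an assumption of this sketch.
    """
    expanded = [(term, 1.0)]
    synonyms = thesaurus.get(term, {})
    ranked = sorted(synonyms.items(), key=lambda kv: kv[1], reverse=True)
    expanded.extend((s, g) for s, g in ranked if g >= threshold)
    return expanded


query = expand(combine("Information", "Retrieval"), THESAURUS)
```

Expanding the combined term rather than "information" and "retrieval" separately keeps the expansion tied to the phrase the user meant, and the threshold drops weakly related synonyms such as the low-grade "data lookup" entry.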