Journal: Cognitive Systems Research

A cognitive inspired unsupervised language-independent text stemmer for Information retrieval


Abstract

In Information Retrieval systems, stemming handles words that can occur in different morphological forms, and hence matches document and query terms that are related in meaning. In this article, we propose a cognitive-inspired, language-independent stemmer that learns groups of morphologically related words from the ambient corpus without any linguistic knowledge or human intervention, and that behaves the way the human brain works. The main idea of our proposed algorithm is to determine only those word variants from the ambient corpus that match the original intent of the query terms. We conducted ad-hoc retrieval experiments in a number of languages of varying morphological complexity using standard TREC, FIRE, and CLEF document collections. The results indicate that stemming improves retrieval accuracy and that the effectiveness of a stemming algorithm increases with the morphological complexity of the language. The results also indicate that our proposed algorithm performs better than stemmers based on linguistic knowledge and other state-of-the-art statistical stemmers in almost all the languages under study. In a multilingual setup, these results are quite encouraging. © 2018 Elsevier B.V. All rights reserved.
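
To make the general idea of corpus-based stemming concrete, here is a minimal illustrative sketch in Python. It is not the algorithm proposed in the article, whose details the abstract does not give. It only assumes a very simple heuristic (words sharing their first few characters are treated as variants), and the function names `build_variant_groups` and `expand_query` and the `min_prefix` parameter are hypothetical.

```python
from collections import defaultdict


def build_variant_groups(vocabulary, min_prefix=4):
    """Group corpus words that share their first `min_prefix` characters.

    Assumption for illustration only: a shared prefix is used as a crude,
    language-independent stand-in for morphological relatedness.
    """
    groups = defaultdict(set)
    for word in vocabulary:
        w = word.lower()
        # Words shorter than the prefix length form their own group.
        key = w[:min_prefix] if len(w) >= min_prefix else w
        groups[key].add(w)
    return groups


def expand_query(query_terms, groups, min_prefix=4):
    """Replace each query term with the sorted list of its corpus variants."""
    expanded = []
    for term in query_terms:
        t = term.lower()
        key = t[:min_prefix] if len(t) >= min_prefix else t
        # The query term itself is always kept, even if unseen in the corpus.
        variants = groups.get(key, set()) | {t}
        expanded.append(sorted(variants))
    return expanded


vocabulary = ["retrieve", "retrieval", "retrieving", "retrieved",
              "stem", "stemming", "stemmer", "query"]
groups = build_variant_groups(vocabulary)
print(expand_query(["retrieval", "stemmer"], groups))
```

A prefix heuristic like this over-groups unrelated words that happen to share a prefix. The abstract's stated aim of keeping only variants that match the intent of the query terms addresses that kind of error, but how the article does so is not described here.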
