Journal: Computer Speech and Language

Should one use term proximity or multi-word terms for Arabic information retrieval?

Abstract

Recently, several information retrieval (IR) models have been proposed to boost retrieval performance using term dependencies. However, for the Arabic language, most IR researchers have focused on the problem of stemming, which is highly challenging in this language. In this paper, we explore whether term dependencies can help improve Arabic IR systems, and which methods work best. To do so, we consider both explicit term dependencies, based on multi-word terms (MWTs) extracted using syntactic patterns and statistical filters, and implicit ones, based on the notion of cross-terms or term proximities. Our experiments, performed on standard TREC Arabic IR collections, show the importance of taking term dependencies into account for Arabic IR. To the best of our knowledge, this is the first study that provides complete extensions of most standard IR models to deal with term dependencies in the Arabic language, together with a comparison of these extensions. © 2019 Elsevier Ltd. All rights reserved.
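
The abstract names term proximity as an implicit form of term dependency but does not give a formula. As a rough illustration only, the sketch below scores a document by how close pairs of query terms occur to each other. The function names and the inverse-square distance weighting are assumptions made for this example and are not taken from the paper; tokenization and Arabic stemming are assumed to have been done beforehand.

```python
from itertools import combinations


def min_pair_distance(tokens, term_a, term_b):
    """Smallest positional distance between any occurrence of term_a
    and any occurrence of term_b, or None if either term is absent."""
    pos_a = [i for i, t in enumerate(tokens) if t == term_a]
    pos_b = [i for i, t in enumerate(tokens) if t == term_b]
    if not pos_a or not pos_b:
        return None
    return min(abs(i - j) for i in pos_a for j in pos_b)


def proximity_score(tokens, query_terms):
    """Sum 1 / d^2 over all pairs of distinct query terms, where d is
    the minimum distance between the two terms in the document.
    The 1 / d^2 weighting is a hypothetical choice for illustration."""
    score = 0.0
    for a, b in combinations(sorted(set(query_terms)), 2):
        d = min_pair_distance(tokens, a, b)
        if d is not None:
            score += 1.0 / (d * d)
    return score
```

In a full system, a score of this kind would typically be combined with a standard per-term ranking score rather than used on its own; how the paper combines them is not stated in the abstract.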
