Source journal: ACM Transactions on Asian Language Information Processing

Towards Effective Strategies for Monolingual and Bilingual Information Retrieval: Lessons Learned from NTCIR-4



Abstract

At the NTCIR-4 workshop, Justsystem Corporation (JSC) and Clairvoyance Corporation (CC) collaborated in the cross-language retrieval task (CLIR). Our goal was to evaluate the performance and robustness of our recently developed commercial-grade CLIR systems for English and Asian languages. The main contribution of this article is the investigation of different strategies, their interactions in both monolingual and bilingual retrieval tasks, and their respective contributions to operational retrieval systems in the context of NTCIR-4. We report results of Japanese and English monolingual retrieval and results of Japanese-to-English bilingual retrieval. In monolingual retrieval analysis, we examine two special properties of the NTCIR experimental design (two levels of relevance and identical queries in multiple languages) and explore how they interact with strategies of our retrieval system, including pseudo-relevance feedback, multi-word term down-weighting, and term weight merging strategies. Our analysis shows that the choice of language (English or Japanese) does not have a significant impact on retrieval performance. Query expansion is slightly more effective with relaxed judgments than with rigid judgments. For better retrieval performance, weights of multi-word terms should be lowered. In the bilingual retrieval analysis, we aim to identify robust strategies that are effective when used alone and when used in combination with other strategies. We examine cross-lingual specific strategies such as translation disambiguation and translation structuring, as well as general strategies such as pseudo-relevance feedback and multi-word term down-weighting. For shorter title topics, pseudo-relevance feedback is a major performance enhancer, but translation structuring affects retrieval performance negatively when used alone or in combination with other strategies. 
All of the strategies tested improve retrieval performance for the longer description topics, with pseudo-relevance feedback and translation structuring as the major contributors.
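
The abstract names pseudo-relevance feedback and multi-word term down-weighting without describing how they operate. The sketch below illustrates the general idea of both techniques only; it is not the JSC/CC system. The function name `expand_query`, its parameters, the weighting formula, and the convention that multi-word terms are space-separated strings are all assumptions made for illustration.

```python
from collections import Counter


def expand_query(query_weights, ranked_docs, top_docs=2, top_terms=2,
                 feedback_weight=0.5, multiword_factor=0.5):
    """Hypothetical sketch: expand a weighted query, then down-weight multi-word terms.

    query_weights: dict mapping term -> weight for the original query.
    ranked_docs:   list of documents (each a list of terms), best-ranked first.
    """
    # Pseudo-relevance feedback: treat the top-ranked documents as relevant
    # and count in how many of them each term occurs.
    feedback_docs = ranked_docs[:top_docs]
    doc_freq = Counter()
    for doc in feedback_docs:
        doc_freq.update(set(doc))

    # Candidate expansion terms are those not already in the query,
    # ordered by document frequency (ties broken alphabetically).
    candidates = [(t, c) for t, c in doc_freq.items() if t not in query_weights]
    candidates.sort(key=lambda tc: (-tc[1], tc[0]))

    expanded = dict(query_weights)
    for term, count in candidates[:top_terms]:
        # Assumed weighting: fraction of feedback documents containing the term.
        expanded[term] = feedback_weight * count / len(feedback_docs)

    # Multi-word term down-weighting: scale every term that contains a space.
    return {
        term: weight * multiword_factor if " " in term else weight
        for term, weight in expanded.items()
    }


if __name__ == "__main__":
    query = {"information retrieval": 1.0, "japanese": 1.0}
    docs = [
        ["japanese", "query expansion", "ntcir"],
        ["ntcir", "clir", "japanese"],
        ["unrelated"],
    ]
    print(expand_query(query, docs))
```

In this toy run, only the first two documents feed the expansion, and the original multi-word term `information retrieval` ends up with half of its initial weight. Real systems typically select expansion terms with corpus statistics rather than raw counts.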
