
Query Expansion for Effective Retrieval Results of Hindi-English Cross-Lingual IR



Abstract

Information retrieval (IR) is the science of identifying documents or sub-documents in a collection of information or a database. The collection need not be available in only one language, since information does not depend on language. Monolingual IR retrieves information in the query language, whereas cross-lingual information retrieval (CLIR) retrieves information in a language that differs from the query language. There is currently strong demand for CLIR systems because they let users extend the search for relevant documents to an international scope. Compared with monolingual IR, one of the biggest problems of CLIR is poor retrieval performance caused by query mismatch, multiple representations of query terms, and untranslated query terms. Query expansion (QE) is the technique of adding related terms to the original query in order to reformulate it. The purpose of QE is to improve the performance and quality of the information retrieved by a CLIR system. In this paper, QE is explored for Hindi-English CLIR, in which Hindi queries are used to search English documents. We used Okapi BM25 for document ranking and then expanded the translated queries using the term selection value. All experiments were performed on the FIRE 2012 dataset. Our results show that the relevance of Hindi-English CLIR results can be improved by adding the lowest-frequency term.
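The pipeline the abstract describes (rank documents with Okapi BM25, then expand the translated query with terms chosen by a selection value) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the exact term selection value formula, so the sketch substitutes the Robertson offer weight (feedback document count times the Robertson/Sparck Jones relevance weight) as an assumed stand-in. The function names, the parameters `k1`, `b`, `top_k`, and `n_terms`, and the toy English corpus are all hypothetical, and the query is assumed to be already translated from Hindi.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each tokenized document against a tokenized query with Okapi BM25."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def expand_query(query, docs, top_k=2, n_terms=1):
    """Append n_terms feedback terms taken from the top_k BM25-ranked documents.

    Assumption: terms are ranked by the Robertson offer weight, used here
    as a stand-in for the paper's term selection value.
    """
    scores = bm25_scores(query, docs)
    ranked = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
    feedback = [docs[i] for i in ranked[:top_k]]
    n_docs, n_fb = len(docs), len(feedback)
    df = Counter(t for d in docs for t in set(d))
    fb_df = Counter(t for d in feedback for t in set(d))

    def offer_weight(term):
        n, r = df[term], fb_df[term]
        # Robertson/Sparck Jones relevance weight with 0.5 smoothing.
        rw = math.log(((r + 0.5) * (n_docs - n - n_fb + r + 0.5))
                      / ((n - r + 0.5) * (n_fb - r + 0.5)))
        return r * rw

    candidates = [t for t in fb_df if t not in query]
    # Highest weight first; ties broken alphabetically for determinism.
    candidates.sort(key=lambda t: (-offer_weight(t), t))
    return list(query) + candidates[:n_terms]


# Hypothetical toy corpus of tokenized English documents.
docs = [
    ["election", "result", "india", "vote"],
    ["election", "vote", "parliament"],
    ["cricket", "match", "india"],
    ["weather", "rain", "monsoon"],
]
print(expand_query(["election"], docs))
```

On this toy corpus the query `["election"]` is expanded with `vote`, the only candidate that occurs in both feedback documents and nowhere else. The paper's actual finding concerns adding the lowest-frequency term; reproducing that would mean replacing the ranking key with the authors' selection criterion, which the abstract does not spell out.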

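Bibliographic details

  • Source
    Applied Artificial Intelligence | 2019, Issue 8 | pp. 567-593 | 27 pages
  • Author affiliations

    BBA A Cent Univ Dept Comp Sci Lucknow 226025 Uttar Pradesh India;

    BBA A Cent Univ Dept Comp Sci Lucknow 226025 Uttar Pradesh India;

  • Indexing information
  • Original format: PDF
  • Language of text: English
  • Chinese Library Classification
  • Keywords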
