A study of user profile representation for personalized cross-language information retrieval

Abstract

Purpose - With an increase in the amount of multilingual content on the World Wide Web, users are often striving to access information provided in a language of which they are non-native speakers. The purpose of this paper is to present a comprehensive study of user profile representation techniques and investigate their use in personalized cross-language information retrieval (CLIR) systems through the means of personalized query expansion.

Design/methodology/approach - The user profiles consist of weighted terms computed by using frequency-based methods such as tf-idf and BM25, as well as various latent semantic models trained on monolingual documents and cross-lingual comparable documents. This paper also proposes an automatic evaluation method for comparing various user profile generation techniques and query expansion methods.

Findings - Experimental results suggest that latent semantic-weighted user profile representation techniques are superior to frequency-based methods, and are particularly suitable for users with a sufficient amount of historical data. The study also confirmed that user profiles represented by latent semantic models trained on a cross-lingual level gained better performance than the models trained on a monolingual level.

Originality/value - Previous studies on personalized information retrieval systems have primarily investigated user profiles and personalization strategies on a monolingual level. The effect of utilizing such monolingual profiles for personalized CLIR remains unclear. The current study fills the gap by a comprehensive study of user profile representation for personalized CLIR and a novel personalized CLIR evaluation methodology to ensure repeatable and controlled experiments can be conducted.

Record details

  • Source
    Aslib Proceedings | 2016, Issue 4 | pp. 448-477 | 30 pages
  • Author affiliations

    School of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan, China;

    School of Computer Science and Statistics, Trinity College Dublin, Dublin, Ireland;

    School of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan, China;

    School of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan, China;

    School of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan, China;

  • Indexed in: Science Citation Index (SCI), USA;
  • Original format: PDF
  • Language of text: English
  • Chinese Library Classification
  • Keywords

    Query expansion; Personalization; Automatic evaluation; Cross-language information retrieval; Topic models; User profile representation;

  • Date added: 2022-08-17 23:15:36
