Conference: NLDB 2013

Exploiting Query Logs and Field-Based Models to Address Term Mismatch in an HIV/AIDS FAQ Retrieval System

Abstract

One of the main challenges in the retrieval of Frequently Asked Questions (FAQ) is that the terms information seekers use to express their information need often differ from those used in the relevant FAQ documents. This lexical disagreement (also known as term mismatch) can lead retrieval systems that rely on keyword matching in their weighting models to rank the relevant FAQ documents less effectively. In this paper, we tackle this lexical gap in an SMS-based HIV/AIDS FAQ retrieval system by enriching the traditional FAQ document representation with terms from a query log, which are added as a separate field in a field-based model. We evaluate our approach using a collection of FAQ documents produced by a national health service and a corresponding query log collected over a period of 3 months. Our results suggest that enriching the FAQ documents with additional terms from the SMS queries for which the true relevant FAQ documents are known, and combining term frequencies from the different fields, markedly alleviates the lexical mismatch problem in our system, leading to an overall improvement in retrieval performance in terms of Mean Reciprocal Rank (MRR) and recall.
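The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not name the weighting model, so the log-saturated weighted term-frequency score below is a stand-in assumption, and the FAQ texts, logged queries, field names (`faq`, `log`), and field weights are all hypothetical. The sketch shows only the two mechanics the abstract describes: storing query-log terms as a separate field and combining per-field term frequencies, then measuring the effect with MRR.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (a simplifying assumption)."""
    return text.lower().split()


def build_fields(faq_text, logged_queries):
    """Represent one FAQ as two fields: its own text and its matched log queries."""
    return {
        "faq": Counter(tokenize(faq_text)),
        "log": Counter(t for q in logged_queries for t in tokenize(q)),
    }


def score(query, fields, weights):
    """Combine per-field term frequencies with field weights, then saturate.

    Stand-in scoring function, not the weighting model used in the paper.
    """
    total = 0.0
    for term in tokenize(query):
        tf = sum(weights[name] * counts[term] for name, counts in fields.items())
        total += math.log1p(tf)
    return total


def rank(query, docs, weights):
    """Return document ids sorted by descending score."""
    return sorted(docs, key=lambda d: score(query, docs[d], weights), reverse=True)


def mean_reciprocal_rank(rankings, relevant):
    """Average of 1/rank of the known relevant document for each query."""
    rr = [
        1.0 / (ranked.index(rel) + 1) if rel in ranked else 0.0
        for ranked, rel in zip(rankings, relevant)
    ]
    return sum(rr) / len(rr)


# Hypothetical FAQ collection; each entry carries log queries known to match it.
docs = {
    "faq1": build_fields(
        "How is HIV transmitted from mother to child",
        ["can a baby get the virus from breastfeeding"],
    ),
    "faq2": build_fields(
        "What are the symptoms of HIV infection",
        ["what signs show that someone is sick"],
    ),
}

query = "can breastfeeding give my baby hiv infection"  # true answer: faq1

without_log = rank(query, docs, {"faq": 1.0, "log": 0.0})
with_log = rank(query, docs, {"faq": 1.0, "log": 1.0})

mrr_without = mean_reciprocal_rank([without_log], ["faq1"])
mrr_with = mean_reciprocal_rank([with_log], ["faq1"])
```

In this toy setup the query shares more words with the text of the wrong FAQ, so ranking on the FAQ field alone places `faq1` second. Once the log field contributes its term frequencies, `faq1` moves to the top and the reciprocal rank for this query rises from 0.5 to 1.0.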
