Venue: ACM International Conference on Information and Knowledge Management

Improving Context-Aware Query Classification via Adaptive Self-training

Abstract

Topical classification of user queries is critical for general-purpose web search systems. It is also challenging because query terms are sparse and labeled queries are scarce. Meanwhile, two resources remain underused in most query classification systems: the search contexts embedded in query sessions and the unlabeled queries freely available on the web. In this work, we leverage this information to improve query classification accuracy. We first incorporate search contexts into our framework using a Conditional Random Field (CRF) model. We favor discriminative training of CRFs over traditional maximum likelihood training because of its robustness to noise. We then adapt self-training to our model to exploit the information in unlabeled queries. By investigating different confidence measurements and model selection strategies, we effectively avoid the error-reinforcing nature of self-training. In extensive experiments on real search logs, we achieve an average improvement of around 20% in classification accuracy over other state-of-the-art baselines.
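The self-training idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: a small bag-of-words Naive Bayes classifier stands in for the CRF (so session context is not modeled), the confidence measure is simply the top posterior probability, and model selection is a plain check against held-out labeled queries. All function names, the `threshold` value, and the toy queries are hypothetical.

```python
import math
from collections import Counter, defaultdict


def train_nb(examples):
    """Fit a multinomial Naive Bayes model from (query, label) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for query, label in examples:
        class_counts[label] += 1
        for tok in query.lower().split():
            word_counts[label][tok] += 1
            vocab.add(tok)
    return class_counts, word_counts, vocab


def predict(model, query):
    """Return (best_label, posterior_probability) using add-one smoothing."""
    class_counts, word_counts, vocab = model
    total = sum(class_counts.values())
    log_scores = {}
    for label, n in class_counts.items():
        denom = sum(word_counts[label].values()) + len(vocab)
        score = math.log(n / total)
        for tok in query.lower().split():
            if tok in vocab:  # tokens never seen in training are ignored
                score += math.log((word_counts[label][tok] + 1) / denom)
        log_scores[label] = score
    top = max(log_scores.values())
    weights = {lab: math.exp(s - top) for lab, s in log_scores.items()}
    z = sum(weights.values())
    best = max(weights, key=weights.get)
    return best, weights[best] / z


def accuracy(model, examples):
    """Fraction of (query, label) pairs the model classifies correctly."""
    hits = sum(predict(model, q)[0] == y for q, y in examples)
    return hits / len(examples)


def self_train(labeled, unlabeled, validation, threshold=0.8, max_rounds=5):
    """Self-training with a confidence filter and a validation check.

    Each round pseudo-labels the unlabeled queries whose top posterior is at
    least `threshold`, retrains on them, and keeps the new model only if
    validation accuracy does not drop. The check is a simple guard against
    reinforcing early mistakes.
    """
    train, pool = list(labeled), list(unlabeled)
    model = train_nb(train)
    best_acc = accuracy(model, validation)
    for _ in range(max_rounds):
        confident, remaining = [], []
        for query in pool:
            label, prob = predict(model, query)
            if prob >= threshold:
                confident.append((query, label))
            else:
                remaining.append(query)
        if not confident:
            break  # nothing left that the model is sure about
        candidate = train_nb(train + confident)
        cand_acc = accuracy(candidate, validation)
        if cand_acc < best_acc:
            break  # reject the round: the pseudo-labels hurt held-out accuracy
        model, best_acc = candidate, cand_acc
        train, pool = train + confident, remaining
    return model, train
```

Calling `self_train(labeled, unlabeled, validation)` returns the selected model together with the training set it grew, so the pseudo-labeled queries can be inspected. Raising `threshold` admits fewer pseudo-labels per round, which trades coverage for a lower risk of reinforcing errors.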
