Venue: ACM International Conference on Information and Knowledge Management

A Language Model Approach to Capture Commercial Intent and Information Relevance for Sponsored Search


Abstract

A fundamental task in sponsored search is finding the best match between web search queries and textual advertisements. To address this problem, we explicitly characterize the criteria for an advertisement to be a 'good match' for a query from two aspects: it should be relevant to the query from an informational perspective, and it should capture and satisfy the commercial intent in the query. Correspondingly, we introduce a mixture language model with two parts: a commercial model, which characterizes the language bias of commercial intent by leveraging users' clicks on advertisements, and an informational model, a traditional language model that takes the entropy of each word into account to capture informational relevance. We then introduce a regularized expectation-maximization (EM) algorithm for parameter estimation, and integrate query commercial intent into the scoring function to boost overall click efficiency. Empirical evaluation shows that our model outperforms a well-tuned classical language model and a carefully designed TFIDF-pLSI model (6% and 5% precision improvement at our production operating point of 30% recall, and 5.3% and 6.3% AUC improvement), and outperforms the KL-divergence language model on tail queries (0.5% nDCG improvement). A live traffic test also shows over 2% CTR lift and 2.5% RPS lift.
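To make the two-part mixture concrete, the following is a minimal sketch of a generic two-component unigram mixture whose mixing weight is fitted by plain EM. It is not the paper's model: the regularization term, the entropy weighting of words, and the query-intent factor in the scoring function are all omitted, and every name here (`unigram_model`, `em_mixture_weight`, `log_score`, the toy texts) is a hypothetical chosen for illustration.

```python
import math
from collections import Counter


def unigram_model(tokens, vocab, alpha=1.0):
    """Additively smoothed unigram distribution over a fixed vocabulary."""
    counts = Counter(tokens)
    total = len(tokens) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def log_score(tokens, p_commercial, p_informational, lam):
    """Log-likelihood of tokens under lam * P_c + (1 - lam) * P_i."""
    return sum(
        math.log(lam * p_commercial[w] + (1.0 - lam) * p_informational[w])
        for w in tokens
    )


def em_mixture_weight(tokens, p_commercial, p_informational, lam=0.5, iters=50):
    """Fit the mixing weight by plain (unregularized) EM; components stay fixed."""
    for _ in range(iters):
        # E-step: posterior probability that each token came from the
        # commercial component.
        resp = [
            lam * p_commercial[w]
            / (lam * p_commercial[w] + (1.0 - lam) * p_informational[w])
            for w in tokens
        ]
        # M-step: the new weight is the mean responsibility.
        lam = sum(resp) / len(resp)
    return lam


# Toy data (assumed): text from clicked ads vs. general informational text.
vocab = {"buy", "cheap", "shoes", "history", "of"}
p_c = unigram_model("buy cheap shoes buy shoes".split(), vocab)
p_i = unigram_model("history of shoes of history".split(), vocab)

observed = "buy shoes history".split()
lam = em_mixture_weight(observed, p_c, p_i)
fitted_ll = log_score(observed, p_c, p_i, lam)
initial_ll = log_score(observed, p_c, p_i, 0.5)
```

Because each EM iteration cannot decrease the likelihood, `fitted_ll` is at least `initial_ll`. In a fuller system the component distributions would also be re-estimated, and a regularizer would constrain the updates.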
