Conference: IEEE International Conference on Data Mining Workshops

Generalized Learning of Neural Network Based Semantic Similarity Models and Its Application in Movie Search

Abstract

Modeling text semantic similarity via neural network approaches has significantly improved performance on a range of information retrieval tasks in recent studies. However, these neural network based latent semantic models are mostly trained on simple user behavior log data such as clicked (query, document) pairs, and all clicked pairs are assumed to be uniformly positive examples. As a result, the existing method for learning the model parameters does not differentiate between data samples that may reflect different degrees of relevance. In this paper, we relax this assumption and propose a new learning method, based on a generalized loss function, that captures the subtle relevance differences among training samples when a more granular label structure is available. We have applied it to the Xbox One's movie search task, where session-based user behavior information is available and the granular relevance differences among training samples are derived from the session logs. Compared with the existing method, our new generalized loss function demonstrates superior test performance as measured by several user-engagement metrics. It also yields a significant performance lift when the score computed by our model is used as a semantic similarity feature in the gradient boosted decision tree model, which is widely used in modern search engines.
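
The abstract does not state the form of the generalized loss, so the following is only an illustrative sketch of one way graded relevance could enter a similarity-model loss. It assumes a softmax over scaled cosine similarities between a query vector and candidate document vectors, with the target distribution obtained by normalizing graded relevance labels. The function names, the `gamma` scaling factor, and the label normalization are all hypothetical choices and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def graded_softmax_loss(query_vec, doc_vecs, relevance, gamma=10.0):
    """Cross-entropy between normalized graded labels and the model's
    softmax over scaled cosine similarities.

    Assumption (not from the paper): `relevance` holds non-negative
    grades, e.g. derived from session logs, and `gamma` is a hypothetical
    smoothing factor applied to the similarities.
    """
    total = sum(relevance)
    if total <= 0:
        raise ValueError("at least one candidate needs positive relevance")
    target = [r / total for r in relevance]
    probs = softmax([gamma * cosine(query_vec, d) for d in doc_vecs])
    # Candidates with zero target weight contribute nothing to the loss.
    return -sum(t * math.log(p) for t, p in zip(target, probs) if t > 0)


query = [1.0, 0.0]
docs = [[1.0, 0.0], [0.0, 1.0]]

# One-hot labels: every clicked pair is treated as an equally positive example.
uniform_loss = graded_softmax_loss(query, docs, [1, 0])

# Graded labels: the second document also carries some relevance.
graded_loss = graded_softmax_loss(query, docs, [2, 1])
```

With one-hot labels this sketch reduces to the negative log-likelihood of the clicked document, which corresponds to the uniform-positive assumption the abstract describes; graded labels spread the target mass across candidates in proportion to their relevance.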