Journal: Data Science and Engineering

Unsupervised Qualitative Scoring for Binary Item Features


Abstract

Binary features, such as categories, keywords, or tags, are widely used to describe product properties. However, these features are incomplete because they carry no numerical information about how well an item satisfies a property. A qualitative score for each tag is commonly used to indicate which product is better with respect to a given property. For example, on a restaurant navigation site, properties such as mood, dishes, and location are given as numerical values that represent the goodness of each aspect. In this paper, we propose a novel approach to estimating qualitative scores from the binary features of products. Based on the natural assumption that an item with a better property is more popular among users who prefer that property (in short, "experts know best"), we introduce both discriminative and generative models that infer user preferences and item qualitative scores from user-item interactions. We constrain the space of item qualitative scores by the item binary features, so that the score of an item for a tag can be nonzero only when the item has that tag. This approach helps to resolve three difficulties: (1) the absence of supervised data for score estimation, (2) implicit user purpose, and (3) contamination by irrelevant tags. We evaluate our models on two artificial datasets and two real-world datasets of movie and book ratings. With the artificial datasets, we evaluate the performance of our models under sparse-transaction and noisy-tag settings. With the real-world movie rating dataset, we evaluate how well our models handle irrelevant tags and observe that they outperform a baseline model. Finally, we compare the tag rankings obtained from the real-world datasets with those of a baseline model.
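The tag constraint described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' discriminative or generative model: the function names, the toy matrices, and the linear form of the affinity (user preference times masked score) are all assumptions made for illustration. It only shows how masking by the binary item-tag matrix forces the score of an untagged (item, tag) pair to zero.

```python
import numpy as np

def masked_scores(raw_scores, tag_matrix):
    """Zero out the score of every (item, tag) pair where the item lacks the tag."""
    return raw_scores * tag_matrix

def interaction_affinity(user_prefs, raw_scores, tag_matrix):
    """Hypothetical affinity: preference-weighted sum of the masked item scores."""
    return user_prefs @ masked_scores(raw_scores, tag_matrix).T

# Toy data (assumed): 3 items, 2 tags, 2 users.
tag_matrix = np.array([[1, 0],
                       [1, 1],
                       [0, 1]])           # binary item-tag features
raw_scores = np.array([[0.9, 0.5],
                       [0.2, 0.7],
                       [0.4, 0.1]])       # unconstrained item-tag scores
user_prefs = np.array([[1.0, 0.0],
                       [0.0, 1.0]])       # each user cares about one tag only

masked = masked_scores(raw_scores, tag_matrix)
affinity = interaction_affinity(user_prefs, raw_scores, tag_matrix)
```

In this toy setting the first user, who only cares about the first tag, has the highest affinity for the first item, which matches the "experts know best" intuition: the item with the better score on a property ranks highest for users who prefer that property.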
