
A Probabilistic Model for Personalized Tag Prediction


Abstract

Social tagging systems have become increasingly popular for sharing and organizing web resources, and tag recommendation is a common feature of such systems. Social tagging is by nature an incremental process: once a user has saved a web page with tags, the tagging system can provide more accurate predictions for that user, based on the user's incremental behavior. However, existing tag prediction methods do not consider this important factor, because their training and test datasets are either split by a fixed time stamp or randomly sampled from a larger corpus. In our temporal experiments, we perform a time-sensitive sampling on an existing public dataset, resulting in a new scenario that is much closer to the "real world". In this paper, we address the problem of tag prediction by proposing a probabilistic model for personalized tag prediction. The model takes a Bayesian approach and integrates three factors: an ego-centric effect, environmental effects, and web page content. Two methods, intuitive calculation and learning optimization, are provided for parameter estimation. Pure graph-based methods, which may have significant constraints (such as requiring every user, every item, and every tag to occur in at least p posts), cannot make a prediction in most "real-world" cases, while our model improves the F-measure by over 30% compared to a leading algorithm on a publicly available real-world dataset.
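
The incremental, time-ordered evaluation described in the abstract can be illustrated with a minimal sketch. This is not the paper's model or its experimental code. The predictor is a hypothetical per-user tag-frequency baseline that loosely stands in for the ego-centric factor only, and the names `f_measure`, `incremental_eval`, and the `(timestamp, user, tags)` post layout are assumptions made for illustration. The sketch shows only the protocol: each post is predicted from what the user saved earlier, scored by F-measure, and then added to that user's history.

```python
from collections import Counter, defaultdict


def f_measure(predicted, actual):
    """Harmonic mean of precision and recall for one post's tag set."""
    predicted, actual = set(predicted), set(actual)
    hits = len(predicted & actual)
    if hits == 0:
        return 0.0
    precision = hits / len(predicted)
    recall = hits / len(actual)
    return 2 * precision * recall / (precision + recall)


def incremental_eval(posts, k=2):
    """Replay posts in time order and return the mean F-measure.

    posts: iterable of (timestamp, user, tags) tuples (assumed layout).
    Assumption: the predictor is a simple per-user frequency baseline,
    not the Bayesian model from the paper.
    """
    history = defaultdict(Counter)  # user -> counts of tags used so far
    scores = []
    for _, user, tags in sorted(posts, key=lambda post: post[0]):
        # Predict the user's k most frequent earlier tags (empty on cold start).
        predicted = [tag for tag, _ in history[user].most_common(k)]
        scores.append(f_measure(predicted, tags))
        # Only after scoring does the post become part of the history.
        history[user].update(tags)
    return sum(scores) / len(scores) if scores else 0.0


if __name__ == "__main__":
    sample_posts = [
        (1, "alice", ["python", "ml"]),
        (2, "alice", ["python", "ml", "web"]),
        (3, "bob", ["news"]),
        (4, "alice", ["python", "ml"]),
    ]
    print(incremental_eval(sample_posts, k=2))
```

Cold-start posts (a user's first post) are scored as 0 here rather than skipped. That is one possible design choice, and the abstract does not say how the paper handles such cases.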
