2017 International Conference on Computational Intelligence in Data Science

Twitter-user recommender system using tweets: A content-based approach

Abstract

With the internet now part of everyday life, online social networks such as Facebook and Twitter play a major role in networking, information dissemination and entertainment. As of 2017, Twitter reaches over 317M monthly active users who generate more than 320M tweets every day, making it one of the fastest information dissemination media of this era. To aid data distribution without overloading users with information, we develop a recommender system that focuses on a vital aspect of social media, the relationships among users, and addresses a common question: whom to follow or befriend? By choosing the right accounts and users to follow, a user can control their sources of information as desired. The information collected from a user's most recent tweets is used to find other users whose recent tweets contain similar information, subject to the constraint that the users share at least one mutual friend. By exploiting the continuous, real-time updating of data on social networks, we develop a method to ensure that our training sets consist of information relevant for classification, thus preserving accuracy while reducing training set sizes for probabilistic learning models. We use two algorithms to detect tweets on common topics, namely a noun phrase detector and a Naive Bayes text (topic) classifier, and compare their complexity and accuracy. The Naive Bayes classifier, despite being probabilistic, performed well with a relatively small training set. This exception applies only to Twitter, because it is a framework that updates in real time. Exact matches were hard to obtain with the noun phrase detector, as we go only one level deep due to limited compute. However, when matches were found, it was up to 90% accurate. Experiments on tweets of random public users show that a Naive Bayes classifier with a small but recent training data set can work as well as or better than a collaborative filter, without the assumptions of the collaborative model.
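
The pipeline described in the abstract (classify recent tweets by topic, then suggest users with similar topics and at least one mutual friend) might be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the class `NaiveBayesTopicClassifier`, the helpers `dominant_topic` and `recommend`, the add-one smoothing, the "most frequent topic" matching rule, and all of the toy tweets, topics and friend lists are assumptions introduced here.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a tweet and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesTopicClassifier:
    """Multinomial Naive Bayes with add-one smoothing (hypothetical helper)."""

    def fit(self, labeled_tweets):
        self.word_counts = defaultdict(Counter)  # topic -> word -> count
        self.topic_counts = Counter()            # topic -> number of tweets
        self.vocab = set()
        for text, topic in labeled_tweets:
            tokens = tokenize(text)
            self.word_counts[topic].update(tokens)
            self.topic_counts[topic] += 1
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        tokens = tokenize(text)
        total = sum(self.topic_counts.values())
        best_topic, best_score = None, -math.inf
        for topic, n in self.topic_counts.items():
            counts = self.word_counts[topic]
            denom = sum(counts.values()) + len(self.vocab)
            # log prior + sum of smoothed log likelihoods
            score = math.log(n / total)
            for tok in tokens:
                score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_topic, best_score = topic, score
        return best_topic


def dominant_topic(clf, tweets):
    """Most frequent predicted topic over a user's recent tweets."""
    return Counter(clf.predict(t) for t in tweets).most_common(1)[0][0]


def recommend(clf, target, recent_tweets, friends):
    """Users who share the target's dominant topic and >= 1 mutual friend."""
    topic = dominant_topic(clf, recent_tweets[target])
    return sorted(
        user for user in recent_tweets
        if user != target
        and friends[user] & friends[target]  # at least one mutual friend
        and dominant_topic(clf, recent_tweets[user]) == topic
    )


# Toy data, invented for illustration only: a small, recent training set.
training = [
    ("the team won the match with a late goal", "sports"),
    ("great goal by the striker in the final", "sports"),
    ("new phone launch with a faster processor", "tech"),
    ("the software update fixes a security bug", "tech"),
]
clf = NaiveBayesTopicClassifier().fit(training)

recent_tweets = {
    "alice": ["what a goal in the match tonight", "the striker was great"],
    "bob": ["late goal decided the final"],
    "carol": ["the phone update has a bug"],
    "dave": ["the team won the final"],
}
friends = {
    "alice": {"erin"},
    "bob": {"erin", "frank"},
    "carol": {"erin"},
    "dave": {"frank"},
}

print(recommend(clf, "alice", recent_tweets, friends))  # ['bob']
```

In this toy run, carol shares a mutual friend with alice but tweets about a different topic, and dave tweets about the same topic but shares no mutual friend, so only bob is suggested. Retraining `clf` on a fresh batch of recent tweets is one way the small-but-recent training set idea could be approximated.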