Journal: Journal of Ambient Intelligence and Humanized Computing

Micro-blog in China: identify influential users and automatically classify posts on Sina micro-blog

Abstract

Sina micro-blog (Weibo) is the first micro-blogging service in China and has grown rapidly over the past two years. This paper first studies the characteristics of the Sina online social network and then focuses on two problems: identifying influential users and automatically classifying micro-blog posts. In a dataset prepared for this study, we find, among other properties, an approximate power-law follower distribution, a non-power-law friend distribution, and a log correlation between follower number and tweet number. To find the most popular users, we propose an algorithm called XinRank and compare it with two other algorithms. The results show that XinRank differs from them and offers a new perspective for finding influential users. In addition, our algorithm is dynamic and stable, which distinguishes it from the other two algorithms and makes it better than them. We then attempt to automatically classify a single Chinese micro-blog post into one of a set of high-level categories using a naive Bayes classifier. Our research indicates that even though an average Chinese micro-blog post is only 28 words long, posts can be categorized into one of eight categories with an average performance of up to 84.2% using our proposed process. At the end of the paper, we attempt to address the problem of automatic user interest discovery. Finally, we combine XinRank and our micro-blog classifier to propose an interest-based influence ranking model.
