Source: International Conference on Computer Science and Electronics Engineering

Mining micro-blogging users' interest features via fingerprint generation

Abstract

Micro-blogging is now widely used as a social network service for communication and information sharing, so mining the behavior features of micro-blogging users is important in both economic and social fields. This paper proposes a framework for analyzing a user's interest features. After data cleaning, word segmentation, POS (part-of-speech) filtering and synonym merging, the keywords, called terms, are extracted from all the tweets posted by a typical user in 2011. A VSM (vector space model) is then used to generate the feature vector of the tweets from these terms. Next, a k-bit binary string, called a fingerprint, is generated from the high-dimensional feature vector of the tweets using the Simhash algorithm. The user's interest features and their change patterns can be detected by analyzing the fingerprint sequence and the distance between each pair of adjacent fingerprints. A series of experiments on Sina micro-blogging is carried out to demonstrate the effectiveness of the algorithms.
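
The fingerprint step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the term weighting, the hash function, the value of k, or the distance measure, so this sketch assumes plain term-frequency weights, an MD5-based term hash, k = 64, and Hamming distance between fingerprints. The function names `term_weights`, `simhash`, and `hamming` and the sample terms are hypothetical.

```python
import hashlib
from collections import Counter


def term_weights(terms):
    """Term-frequency weights; an assumed stand-in for the paper's VSM vector."""
    return Counter(terms)


def simhash(weights, k=64):
    """Collapse a weighted term vector into a k-bit fingerprint (k <= 128 here)."""
    totals = [0] * k
    mask = (1 << k) - 1
    for term, weight in weights.items():
        # Stable per-term hash truncated to k bits. MD5 is an illustrative
        # choice only; it yields 128 bits, which caps k in this sketch.
        digest = hashlib.md5(term.encode("utf-8")).digest()
        h = int.from_bytes(digest, "big") & mask
        for i in range(k):
            # A set bit votes +weight for position i, a clear bit votes -weight.
            totals[i] += weight if (h >> i) & 1 else -weight
    fingerprint = 0
    for i, total in enumerate(totals):
        if total > 0:
            fingerprint |= 1 << i
    return fingerprint


def hamming(a, b):
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


# Hypothetical terms for three consecutive time windows of one user.
periods = [
    ["football", "league", "goal", "football"],
    ["football", "goal", "transfer"],
    ["stock", "market", "fund"],
]
fingerprints = [simhash(term_weights(p)) for p in periods]
distances = [hamming(a, b) for a, b in zip(fingerprints, fingerprints[1:])]
print(distances)
```

Under these assumptions, a small distance between adjacent fingerprints suggests that the user's terms changed little between windows, while a large distance suggests a shift in interests. With only a handful of terms per window the distances are noisy, so the toy data above shows the mechanics rather than a meaningful result.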
