In microblog user search, ranking results purely by follower count leaves room for follower-buying fraud. By treating users as Web pages and the "follow" relationships between users as links between pages, this paper applies the basic idea of PageRank to microblog user search. A state-transition matrix and an automatically iterating MapReduce workflow are introduced to parallelize the computation, yielding a MapReduce-based ranking algorithm for microblog user search. Experiments on the Hadoop platform show that the algorithm keeps a user's rank from depending solely on follower count, makes it harder to cheat the search engine, raises the ranking of more "important" users in the search results, and improves the relevance and quality of the results.
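The idea of iterating a PageRank-style update as repeated map and reduce passes over the follow graph can be illustrated with a small single-process sketch. This is not the authors' implementation: the function names, the damping factor of 0.85, the handling of users who follow no one, and the convergence tolerance are all assumptions made for illustration, and plain Python functions stand in for Hadoop jobs.

```python
from collections import defaultdict

DAMPING = 0.85  # assumed value; the abstract does not state one


def map_phase(ranks, follows):
    """Map step: each user passes an equal share of its rank to every account it follows."""
    for user, rank in ranks.items():
        followees = follows.get(user, [])
        if followees:
            share = rank / len(followees)
            for followee in followees:
                yield followee, share
        else:
            # Assumption: a user who follows no one spreads rank evenly over all users.
            share = rank / len(ranks)
            for other in ranks:
                yield other, share


def reduce_phase(pairs, users):
    """Reduce step: sum the shares received by each user and apply damping."""
    totals = defaultdict(float)
    for user, share in pairs:
        totals[user] += share
    n = len(users)
    return {u: (1 - DAMPING) / n + DAMPING * totals[u] for u in users}


def rank_users(follows, tol=1e-9, max_iter=100):
    """Repeat map and reduce until the ranks stop changing (or max_iter is reached)."""
    users = set(follows) | {v for vs in follows.values() for v in vs}
    ranks = {u: 1.0 / len(users) for u in users}
    for _ in range(max_iter):
        new_ranks = reduce_phase(map_phase(ranks, follows), users)
        delta = sum(abs(new_ranks[u] - ranks[u]) for u in users)
        ranks = new_ranks
        if delta < tol:
            break
    return ranks


if __name__ == "__main__":
    # Hypothetical follow graph: key follows each user in the value list.
    follows = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
    for user, score in sorted(rank_users(follows).items(), key=lambda kv: -kv[1]):
        print(user, round(score, 4))
```

In this toy graph, user `a` has a single follower yet ranks well above `b` and `d`, because that follower is the highly ranked `c`. This is the effect the abstract describes: rank depends on who follows a user, not only on how many do.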