International Conference on Computational Linguistics; COLING 2010

Identifying and Ranking Topic Clusters in the Blogosphere


Abstract

The blogosphere is a huge, collaboratively constructed resource containing diverse and rich information. This diversity and richness present a significant research challenge to the Information Retrieval community. This paper addresses the challenge by proposing a method for identifying “topic clusters” within the blogosphere, where a topic cluster groups together blogs that share a common interest, i.e. a topic. The algorithm takes into account both the hyperlinked social network of blogs and the content of the blog posts. Additionally, we use various forms and parts of speech of the topic to provide broader coverage of the blogosphere. The next step of the method assigns a topic-specific rank to each blog in the cluster using a metric called “Topic Discussion Rank,” which helps identify the most influential blog for a specific topic. We also perform an experimental evaluation of our method on real blog data and show that the proposed method reaches a high level of accuracy.
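The two steps described in the abstract (cluster by topic using content and links, then rank within the cluster) can be illustrated with a rough sketch. The abstract does not define how clusters are built or how Topic Discussion Rank is computed, so everything here is an assumption: `topic_cluster` and `rank_blogs` are hypothetical names, membership is decided by simple word matching against a hand-supplied set of topic word forms, and the ranking is a plain PageRank-style score used only as a stand-in, not the paper's metric.

```python
def topic_cluster(posts, links, topic_forms):
    """Return (members, edges) for one topic.

    posts:       dict mapping blog id -> list of post texts
    links:       iterable of (source_blog, target_blog) hyperlinks
    topic_forms: word forms of the topic, e.g. {"cycle", "cycling"}
    All three inputs are illustrative assumptions, not the paper's data model.
    """
    forms = {f.lower() for f in topic_forms}
    # Content step: keep blogs with at least one post mentioning a topic form.
    members = {
        blog
        for blog, texts in posts.items()
        if any(
            word.strip('.,!?;:"()').lower() in forms
            for text in texts
            for word in text.split()
        )
    }
    # Link step: keep only hyperlinks that connect two member blogs.
    edges = {(s, d) for s, d in links if s in members and d in members and s != d}
    return members, edges


def rank_blogs(members, edges, damping=0.85, iterations=50):
    """PageRank-style score over the cluster's link graph.

    This is a stand-in for illustration, NOT the paper's Topic Discussion Rank.
    """
    if not members:
        return {}
    out = {blog: set() for blog in members}
    for s, d in edges:
        out[s].add(d)
    n = len(members)
    score = {blog: 1.0 / n for blog in members}
    for _ in range(iterations):
        # Blogs with no outgoing links spread their score evenly.
        dangling = sum(score[b] for b in members if not out[b])
        nxt = {b: (1 - damping) / n + damping * dangling / n for b in members}
        for s, targets in out.items():
            if targets:
                share = damping * score[s] / len(targets)
                for d in targets:
                    nxt[d] += share
        score = nxt
    return score


if __name__ == "__main__":
    posts = {
        "a": ["I love cycling"],
        "b": ["Cyclists unite!"],
        "c": ["cooking pasta"],
        "d": ["new cycle route"],
    }
    links = [("a", "b"), ("d", "b"), ("c", "b"), ("b", "a")]
    members, edges = topic_cluster(posts, links, {"cycle", "cycling", "cyclist", "cyclists"})
    scores = rank_blogs(members, edges)
    print(sorted(scores, key=scores.get, reverse=True))
```

In this toy data, blog "c" is excluded because none of its posts mention the topic, and its link to "b" is ignored when ranking. A real system would need proper morphological analysis for the word forms rather than a hand-written set.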
