Journal: Computer science

Efficient monitoring of personalized hot news over Web 2.0 streams



Abstract

Web 2.0 streams, such as blog postings, micro-blogging tweets, and RSS feeds from online communities, offer a wealth of up-to-date news about real-world events and societal discussion. From a user's perspective, it is becoming increasingly hard to get a decent overview of recent events, given these massive, continuously flowing streams of information. Ideally, a system would continuously assemble recent information, ranked by its current social impact but also weighted by each user's personal interests. In this work, we develop methods to meet these requirements. The presented approach continuously tracks the most popular tags attached to the incoming items and, based on these, constructs a dynamic top-k query. By continuously evaluating this query on the incoming stream, we are able to retrieve the currently hottest items. These hottest items are then fed into an engine that re-ranks them with respect to user-specified interests, given in the form of term-based topic descriptions. Given the high rate of incoming data and the immense number of user profiles, this calls for high-performance algorithms that efficiently retrieve hot documents and subsequently personalize them based on user profiles. We present a combined solution that builds on our prior work on information filtering and show how it can be used together with the current work on continuously determining the hottest documents. To demonstrate the suitability of our approach, we conduct a performance evaluation using a real-world dataset obtained from a weblog crawl.
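
The pipeline outlined in the abstract (track popular tags, build a dynamic top-k query, retrieve the hottest items, re-rank per user) can be illustrated with a small sketch. This is not the authors' algorithm: the abstract gives no scoring function, window model, or data structures, so the count-based sliding window, the tag-sum score, the term-overlap interest measure, and every name below (`HotItemMonitor`, `hot_query`, `hottest`, `personalize`) are assumptions chosen only to make the idea concrete.

```python
from collections import Counter, deque
import heapq


class HotItemMonitor:
    """Toy monitor over a count-based sliding window (hypothetical design)."""

    def __init__(self, window_size, num_query_tags, k):
        self.window = deque()        # recent (item_id, tags, text), oldest first
        self.tag_counts = Counter()  # tag frequencies inside the window
        self.window_size = window_size
        self.num_query_tags = num_query_tags
        self.k = k

    def add(self, item_id, tags, text):
        """Insert an item; expire the oldest item once the window is full."""
        tags = frozenset(tags)
        self.window.append((item_id, tags, text))
        self.tag_counts.update(tags)
        if len(self.window) > self.window_size:
            _, old_tags, _ = self.window.popleft()
            for tag in old_tags:
                self.tag_counts[tag] -= 1
                if self.tag_counts[tag] == 0:
                    del self.tag_counts[tag]

    def hot_query(self):
        """Dynamic query: the currently most frequent tags and their counts."""
        return dict(self.tag_counts.most_common(self.num_query_tags))

    def hottest(self):
        """Top-k window items, scored by the summed counts of their hot tags."""
        query = self.hot_query()
        scored = [
            (sum(query.get(tag, 0) for tag in tags), item_id, text)
            for item_id, tags, text in self.window
        ]
        return heapq.nlargest(self.k, scored, key=lambda entry: entry[0])


def personalize(hot_items, profile_terms):
    """Re-rank hot items by term overlap with a user's topic description.

    Ties in interest fall back to the hotness score (assumed tie-break).
    """
    profile = {term.lower() for term in profile_terms}

    def interest(text):
        return len(profile & set(text.lower().split()))

    return sorted(
        hot_items,
        key=lambda entry: (interest(entry[2]), entry[0]),
        reverse=True,
    )
```

A caller would feed each incoming item to `add`, then pass the result of `hottest()` to `personalize` once per user profile. Recomputing scores over the whole window on every call is deliberately naive; the abstract's emphasis on high-performance algorithms suggests the actual system avoids exactly this kind of full rescan.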
