CIKM 10; ACM Conference on Information and Knowledge Management

The Gist of Everything New: Personalized Top-k Processing over Web 2.0 Streams


Abstract

Web 2.0 portals have made content generation easier than ever, with millions of users contributing news stories in the form of weblog posts or short textual snippets, as on Twitter. Efficient and effective filtering solutions are key to allowing users to stay tuned to this ever-growing ocean of information, releasing only relevant trickles of personal interest. In classical information filtering systems, user interests are formulated using standard IR techniques, and data from all available information sources is filtered against a predefined, absolute quality-based threshold. In contrast to this restrictive approach, which may still overwhelm the user with the returned stream of data, we envision a system that continuously keeps the user updated with only the top-k relevant new information. Freshness of data is guaranteed by considering it valid only for a particular time interval, controlled by a sliding window. Considering relevance as relative to the existing pool of new information creates a highly dynamic setting. We present POL-filter, which together with our maintenance module constitutes an efficient solution to this kind of problem. We show through comprehensive performance evaluations on real-world data, obtained from a weblog crawl, that our approach brings performance gains compared to the state of the art.
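To make the setting concrete, the following is a minimal Python sketch of personalized top-k filtering over a sliding time window. It is a naive baseline that rescans the window on every query, not the POL-filter or the maintenance module from the paper, whose details the abstract does not give. The names `Item`, `score`, and `SlidingWindowTopK`, the term-weight user profile, and the sum-of-weights scoring function are all assumptions made for illustration.

```python
import heapq
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """A stream item (hypothetical shape: id, arrival time, set of terms)."""
    item_id: str
    timestamp: float
    terms: frozenset


def score(profile: dict, item: Item) -> float:
    """Assumed relevance: sum of profile weights of the terms in the item."""
    return sum(weight for term, weight in profile.items() if term in item.terms)


class SlidingWindowTopK:
    """Naive baseline: keep all valid items, recompute the top-k on demand."""

    def __init__(self, profile: dict, k: int, window: float):
        self.profile = profile
        self.k = k
        self.window = window      # validity interval of an item
        self.items = deque()      # items in arrival order

    def add(self, item: Item) -> None:
        """Append a newly arrived item and drop items that are no longer valid."""
        self.items.append(item)
        self._expire(item.timestamp)

    def _expire(self, now: float) -> None:
        # Items arrive in timestamp order, so expired ones sit at the left end.
        while self.items and self.items[0].timestamp <= now - self.window:
            self.items.popleft()

    def top_k(self, now: float) -> list:
        """Return the k most relevant items still inside the window at `now`."""
        self._expire(now)
        return heapq.nlargest(
            self.k, self.items, key=lambda it: score(self.profile, it)
        )
```

Because relevance is relative to the current window, an item can enter the result without any new arrival, simply because a better-scoring item expired. The sketch handles this by rescanning the whole window on each call to `top_k`, which is exactly the cost an incremental maintenance scheme would aim to avoid.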
