Blog feed search with a post index

Wouter Weerkamp; Krisztian Balog; Maarten de Rijke

Abstract

User generated content forms an important domain for mining knowledge. In this paper, we address the task of blog feed search: to find blogs that are principally devoted to a given topic, as opposed to blogs that merely happen to mention the topic in passing. The large number of blogs makes the blogosphere a challenging domain, both in terms of effectiveness and of storage and retrieval efficiency. We examine the effectiveness of an approach to blog feed search that is based on individual posts as indexing units (instead of full blogs). Working in the setting of a probabilistic language modeling approach to information retrieval, we model the blog feed search task by aggregating over a blogger's posts to collect evidence of relevance to the topic and persistence of interest in the topic. This approach achieves state-of-the-art performance in terms of effectiveness. We then introduce a two-stage model where a pre-selection of candidate blogs is followed by a ranking step. The model integrates aggressive pruning techniques as well as very lean representations of the contents of blog posts, resulting in substantial gains in efficiency while maintaining effectiveness at a very competitive level.
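The two ideas in the abstract, aggregating post-level evidence into a blog score and pre-selecting candidates before ranking, can be illustrated with a minimal sketch. This is not the authors' model. The function names (`post_score`, `rank_blogs`), the Jelinek-Mercer smoothing weight `lam`, the uniform post-to-blog association (each post weighted by one over the number of posts in its blog), and the `top_n_posts` pruning cutoff are all assumptions chosen for illustration.

```python
from collections import Counter, defaultdict


def post_score(query_terms, post_tokens, collection_tf, collection_len, lam=0.5):
    """Query likelihood of one post under a smoothed unigram language model.

    Assumption: Jelinek-Mercer smoothing with weight `lam` on the collection
    model; the abstract does not state which smoothing the paper uses.
    """
    tf = Counter(post_tokens)
    n = len(post_tokens)
    score = 1.0
    for term in query_terms:
        p_post = tf[term] / n if n else 0.0
        p_coll = collection_tf.get(term, 0) / collection_len if collection_len else 0.0
        score *= (1 - lam) * p_post + lam * p_coll
    return score


def rank_blogs(query_terms, posts, top_n_posts=1000):
    """Rank blogs from a post-level collection.

    `posts` is a list of (blog_id, tokens) pairs, one entry per post.
    """
    collection_tf = Counter(tok for _, toks in posts for tok in toks)
    collection_len = sum(collection_tf.values())
    posts_per_blog = Counter(blog_id for blog_id, _ in posts)

    # Stage 1 (pre-selection): score every post and keep only the best ones.
    # Blogs with no post among the kept ones are pruned from the ranking.
    scored = [
        (post_score(query_terms, toks, collection_tf, collection_len), blog_id)
        for blog_id, toks in posts
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    kept = scored[:top_n_posts]

    # Stage 2 (ranking): aggregate the kept post scores per blog.
    # Assumption: uniform association, so each post contributes its score
    # divided by the total number of posts in its blog. A blog that returns
    # to the topic in many of its posts therefore outranks a blog that
    # mentions the topic once among many unrelated posts.
    blog_scores = defaultdict(float)
    for score, blog_id in kept:
        blog_scores[blog_id] += score / posts_per_blog[blog_id]
    return sorted(blog_scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy data (invented for this sketch): blog "a" is devoted to jazz,
# blog "b" mentions it in one post out of three.
toy_posts = [
    ("a", ["jazz", "music", "jazz"]),
    ("a", ["jazz", "concert"]),
    ("b", ["jazz", "once"]),
    ("b", ["cooking", "pasta"]),
    ("b", ["travel", "rome"]),
]

for blog_id, score in rank_blogs(["jazz"], toy_posts, top_n_posts=3):
    print(blog_id, round(score, 4))
```

Lowering `top_n_posts` trades effectiveness for efficiency in this sketch: fewer posts are aggregated, at the risk of dropping blogs whose relevant posts score just below the cutoff.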


Bibliographic details

Source: Information Retrieval, 2011, issue 5, pp. 515-545 (31 pages)
Authors: Wouter Weerkamp; Krisztian Balog; Maarten de Rijke
Author affiliation: ISLA, University of Amsterdam, Amsterdam, The Netherlands (listed for each of the three authors)
Original format: PDF
Language: English
Keywords: blog feed search; post-level indexing; efficiency; generative language models; associations

