News Page Discovery Policy for Instant Crawlers

Abstract

Many news pages with high freshness requirements are published on the internet every day. Instant crawlers should download them immediately; otherwise, they soon become outdated. In the past, instant crawlers only downloaded pages from a manually generated list of news websites. Because news websites do not publish news pages exclusively, bandwidth is wasted on downloading non-news pages. This paper proposes a novel approach to discovering news pages, consisting of seed selection and news URL prediction based on user behavior analysis. Empirical studies on two months of user access logs show that the approach outperforms the traditional approach in both precision and recall.
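
The abstract reports results in terms of precision and recall but gives no implementation details. As a minimal sketch of how these two measures are conventionally computed for a set of predicted news URLs, consider the following. The function name and the example URLs are hypothetical and are not taken from the paper.

```python
def precision_recall(predicted, actual):
    """Return (precision, recall) for predicted news URLs against known news URLs.

    Both arguments are iterables of URL strings. This is the standard
    set-based definition, not the paper's own evaluation code.
    """
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)
    # Guard against empty sets to avoid division by zero.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical example data: URLs the crawler predicted to be news pages,
# and the URLs that really are news pages.
predicted_urls = {
    "http://news.example.com/a",
    "http://news.example.com/b",
    "http://news.example.com/c",
    "http://news.example.com/d",
}
actual_news_urls = {
    "http://news.example.com/a",
    "http://news.example.com/b",
    "http://news.example.com/e",
}

precision, recall = precision_recall(predicted_urls, actual_news_urls)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this setting, low precision corresponds to bandwidth spent on non-news pages, while low recall corresponds to news pages the crawler never discovers.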

Bibliographic details

Source: Information Retrieval Technology, 2008, pp. 520-525 (6 pages)
Authors: Yong Wang; Yiqun Liu; Min Zhang; Shaoping Ma
Original format: PDF
CLC classification: Computer equipment security
Keywords: web log; user behavior analysis; news page discovery

Similar literature

1. Research and Construction of the Online Pesticide Information Center and Discovery Platform Based on Web Crawler [J]. Tian Fang, Tan Han, Cheng Zhang. Procedia Computer Science, 2020, No. 5.
2. FC4CD: a new SOA-based Focused Crawler for Cloud service Discovery [J]. Boukadil Khouloud, Rekik Mouna, Rekik Molka. Computing, 2018, No. 10.
3. Self-Adaptive Semantic Focused Crawler for Mining Services Information Discovery [J]. Dong H., Hussain F.K. IEEE Transactions on Industrial Informatics, 2014, No. 2.
4. News Page Discovery Policy for Instant Crawlers [C]. Yong Wang, Yiqun Liu, Min Zhang. Asia Information Retrieval Symposium, 2008.
5. Towards a re-discovery of the public sphere: Myanmar/Burma's 'exile' media's counter-hegemonic potential and the U.S. news media's re-framing of American foreign policy [D]. Labb, Brett R. 2016.
6. Optimization of communication in the surgical program via instant messaging Web-based surveys newsletters websites smartphones and telemedicine: the experience of Oakville Trafalgar Memorial Hospital [O]. Duncan Rozario. 2018.
7. Real Time News Analysis for Improved Social Relationship Discovery; Conference paper [R]. Forester, J., O'May, J. 2007.
