2013 8th Annual ChinaGrid Conference

An Incremental Crawler for Web Video Based on Content Longevity

Abstract

The explosive growth of online video is crucial to the development of video search engines. Search engines use crawlers to retrieve pages and then discover new ones by extracting the pages' outgoing links. However, the distinction between ephemeral and persistent content, which general web crawlers already draw, also applies to online video pages, yet video search engines rarely take it into account. Based on this observation, we characterize the longevity of the content found on video pages and develop an incremental crawler. As part of the crawling policy, we give a practical method for estimating the utility threshold. As we show via experiments over real web data, our refresh policy obtains better freshness at lower cost than previous approaches.
