Journal: Future Internet

The ARCOMEM Architecture for Social- and Semantic-Driven Web Archiving



Abstract

The constantly growing amount of Web content and the success of the Social Web have increased the need for Web archiving. This need goes beyond the pure preservation of Web pages. Web archives are turning into “community memories” that aim at building a better understanding of the public view on, for example, celebrities, court decisions and other events. Due to the size of the Web, the traditional “collect-all” strategy is in many cases not the best method for building Web archives. In this paper, we present the ARCOMEM (From Collect-All Archives to Community Memories) architecture and implementation, which uses semantic information, such as entities, topics and events, complemented with information from the Social Web, to guide a novel Web crawler. The resulting archives are automatically enriched with semantic meta-information to ease access and allow retrieval based on conditions that involve high-level concepts.
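
To make the idea of a semantically and socially guided crawler more concrete, the following is a minimal sketch of a prioritized crawl frontier. It is not the ARCOMEM implementation: the class name `PrioritizedFrontier`, the term-overlap relevance measure, the `social_mentions` count and the `social_weight` parameter are all assumptions introduced for illustration. The only point it shows is that candidate URLs can be ordered by a relevance score instead of being collected indiscriminately.

```python
import heapq


class PrioritizedFrontier:
    """Hypothetical crawl frontier that orders URLs by estimated relevance.

    Illustrative only: the scoring below is an assumed stand-in for the
    entity, topic, event and Social Web analysis named in the abstract.
    """

    def __init__(self, target_terms, social_weight=0.5):
        # Terms (e.g. entity or topic names) that define the crawl scope.
        self.target_terms = {t.lower() for t in target_terms}
        self.social_weight = social_weight
        self._heap = []      # entries are (-score, insertion_order, url)
        self._seen = set()   # URLs already queued, to avoid duplicates
        self._counter = 0    # tie-breaker that keeps insertion order

    def score(self, context_text, social_mentions=0):
        """Combine term overlap with a saturating social signal."""
        words = set(context_text.lower().split())
        if self.target_terms:
            semantic = len(words & self.target_terms) / len(self.target_terms)
        else:
            semantic = 0.0
        # Maps 0, 1, 2, ... mentions into the range [0, 1).
        social = social_mentions / (1 + social_mentions)
        return semantic + self.social_weight * social

    def add(self, url, context_text, social_mentions=0):
        """Queue a URL once, ranked by its score."""
        if url in self._seen:
            return
        self._seen.add(url)
        s = self.score(context_text, social_mentions)
        # heapq is a min-heap, so the score is negated to pop the best first.
        heapq.heappush(self._heap, (-s, self._counter, url))
        self._counter += 1

    def pop(self):
        """Return the most relevant queued URL and its score."""
        neg_score, _, url = heapq.heappop(self._heap)
        return url, -neg_score

    def __len__(self):
        return len(self._heap)


if __name__ == "__main__":
    frontier = PrioritizedFrontier(["court", "decision"])
    frontier.add("http://example.org/a", "weather report for today")
    frontier.add("http://example.org/b", "court decision announced", social_mentions=3)
    frontier.add("http://example.org/c", "the court schedule")
    while len(frontier):
        print(frontier.pop())
```

In this sketch, a page whose link context mentions the target terms and that is referenced on social platforms is fetched before an unrelated page. A real system would replace the term overlap with proper entity, topic and event extraction.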
