Conference: International Conference on Digital Information Management

Distributed Web2.0 Crawling for Ontology Evolution

Abstract

Semantic Web technologies in general, and ontology-based approaches in particular, are considered the foundation for the next generation of information services. While ontologies enable software agents to exchange knowledge and information in a standardised, intelligent manner, describing today's vast amount of information in terms of ontological knowledge and tracking the evolution of such ontologies remains a challenge. In this paper we describe Web2.0 crawling for ontology evolution. Owing to its evolutionary properties and social network characteristics, the World Wide Web, or Web for short, is a well-suited data source for evolving an ontology. The decentralised structure of the Internet, the huge amount of data, and emerging Web2.0 technologies pose several challenges for a crawling system. We present a distributed crawling system with standard browser integration. The proposed system is a high-performance, site-script-based, noise-reducing crawler that loads content equivalent to what a standard browser would load from Web2.0 resources. Furthermore, we describe the integration of this spider into our ontology evolution framework.
