A Semantic Scraping Model for Web Resources - Applying Linked Data to Web Page Screen Scraping

Abstract

Despite the increasing presence of Semantic Web facilities, only a limited number of the resources available on the Internet provide semantic access. Recent initiatives such as the emerging Linked Data Web are providing semantic access to available data by porting existing resources to the Semantic Web using different technologies, such as database-semantic mapping and scraping. Nevertheless, existing scraping tools rely on ad hoc solutions complemented with graphical interfaces for speeding up scraper development. This article proposes a generic framework for web scraping based on semantic technologies. The framework is structured in three levels: scraping services, semantic scraping model, and syntactic scraping. The first level provides an interface through which generic applications or intelligent agents can gather information from the web at a high level. The second level defines a semantic RDF model of the scraping process, in order to provide a declarative approach to the scraping task. Finally, the third level provides an implementation of the RDF scraping model for specific technologies. The work has been validated in a scenario that illustrates its application to mashup technologies.
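The three levels named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the RDF vocabulary or any API, so the `ex:` terms, the `MAPPING` dictionary (a plain-Python stand-in for the RDF scraping model), the `MappingScraper` class, and the `scrape` function are all hypothetical names chosen for illustration. Only the Python standard library is used.

```python
from html.parser import HTMLParser

# Level 2 (semantic scraping model): a declarative description of what to
# extract. Assumption: a dict stands in for the RDF model, and the "ex:"
# terms are invented here; the abstract does not give the real vocabulary.
MAPPING = {
    "type": "ex:Post",
    "fragment": ("div", "post"),  # (tag, class) delimiting one resource
    "properties": {
        "ex:title": ("h2", "title"),
        "ex:author": ("span", "author"),
    },
}


class MappingScraper(HTMLParser):
    """Level 3 (syntactic scraping): interprets a mapping against HTML.

    Simplification: nesting depth is not tracked, so a resource stays
    current until the next fragment starts, and a property element must
    not contain a nested element with the same tag name.
    """

    def __init__(self, mapping):
        super().__init__()
        self.mapping = mapping
        self.triples = []
        self.subject = None       # blank-node id of the current resource
        self.current_prop = None  # (property, tag) while collecting text
        self.buffer = []
        self.count = 0

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        frag_tag, frag_class = self.mapping["fragment"]
        if tag == frag_tag and frag_class in classes:
            self.count += 1
            self.subject = f"_:r{self.count}"
            self.triples.append((self.subject, "rdf:type", self.mapping["type"]))
            return
        if self.subject is None:
            return
        for prop, (ptag, pclass) in self.mapping["properties"].items():
            if tag == ptag and pclass in classes:
                self.current_prop = (prop, tag)
                self.buffer = []

    def handle_data(self, data):
        if self.current_prop is not None:
            self.buffer.append(data)

    def handle_endtag(self, tag):
        if self.current_prop is not None and tag == self.current_prop[1]:
            text = "".join(self.buffer).strip()
            self.triples.append((self.subject, self.current_prop[0], text))
            self.current_prop = None


def scrape(html, mapping=MAPPING):
    """Level 1 (scraping service): a high-level entry point for callers."""
    scraper = MappingScraper(mapping)
    scraper.feed(html)
    scraper.close()
    return scraper.triples
```

In this sketch a caller only sees `scrape`, which returns (subject, predicate, object) tuples. Changing what is extracted means editing the mapping rather than the parser, which is the declarative separation the abstract describes.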
