Published in: ACM SIGMOD International Conference on Management of Data

A layered architecture for querying dynamic Web content

Abstract

The design of webbases, database systems for supporting Web-based applications, is currently an active area of research. In this paper, we propose a 3-layer architecture for designing and implementing webbases for querying dynamic Web content (i.e., data that can only be extracted by filling out multiple forms). The lowest layer, the virtual physical layer, provides navigation independence by shielding the user from the complexities associated with retrieving data from raw Web sources. Next, the traditional logical layer supports site independence. The top layer is analogous to the external schema layer in traditional databases.

Within this architectural framework we address two problems unique to webbases: retrieving dynamic Web content in the virtual physical layer, and end-user querying of the external schema. The layered architecture makes it possible to automate data extraction to a much greater degree than in existing proposals. Wrappers for the virtual physical schema can be created semi-automatically by asking the webbase designer to navigate through the sites of interest, an approach we call mapping by example. Thus, the webbase designer need not have expertise in the language that maps the physical schema to the raw Web (in contrast to other approaches, which require expertise in various Web-enabled flavors of SQL). For the external schema layer, we propose a semantic extension of the universal relation interface. This interface gives the end user powerful, yet reasonably simple, ad hoc querying capabilities, compared with the currently prevailing "canned" form-based interfaces on the one hand and complex Web-enabling extensions of SQL on the other. Finally, we discuss the implementation of the proposed architecture.
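
To make the division of labor among the three layers concrete, here is a minimal Python sketch. It is not the authors' implementation: the car-listing data, the two in-memory "sites", and every function name (`site_a_cars`, `site_b_cars`, `cars`, `ask`) are hypothetical, and form submission is replaced by simple list filtering.

```python
from typing import Dict, List

Row = Dict[str, str]

# Hypothetical stand-ins for two form-driven Web sources with different field names.
SITE_A = [
    {"mk": "ford", "mdl": "escort", "cost": "3000"},
    {"mk": "honda", "mdl": "civic", "cost": "4500"},
]
SITE_B = [{"manufacturer": "ford", "name": "taurus", "usd": "5000"}]


# Layer 1, virtual physical layer: one wrapper per site hides navigation.
# A real wrapper would fill out and submit forms; here we only filter a list.
def site_a_cars(make: str) -> List[Row]:
    return [r for r in SITE_A if r["mk"] == make]


def site_b_cars(make: str) -> List[Row]:
    return [r for r in SITE_B if r["manufacturer"] == make]


# Layer 2, logical layer: a single relation with a common schema,
# so callers no longer care which site a row came from.
def cars(make: str) -> List[Row]:
    rows = [
        {"make": r["mk"], "model": r["mdl"], "price": r["cost"]}
        for r in site_a_cars(make)
    ]
    rows += [
        {"make": r["manufacturer"], "model": r["name"], "price": r["usd"]}
        for r in site_b_cars(make)
    ]
    return rows


# Layer 3, external schema: the end user names attributes only,
# loosely in the spirit of a universal relation interface.
def ask(want: List[str], make: str) -> List[Row]:
    return [{attr: row[attr] for attr in want} for row in cars(make)]


if __name__ == "__main__":
    print(ask(["model", "price"], make="ford"))
```

In this sketch, changing how a site is navigated touches only its layer-1 wrapper, and adding a site touches only `cars`; a query such as `ask(["model", "price"], make="ford")` stays the same in both cases.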
