Querying Websites Using Compact Skeletons

Abstract

Several commercial applications, such as online comparison shopping and process automation, require integrating information that is scattered across multiple websites or XML documents. Much research has been devoted to this problem, resulting in several research prototypes and commercial implementations. Such systems rely on wrappers that provide relational or other structured interfaces to websites. Traditionally, wrappers have been constructed by hand on a per-website basis, constraining the scalability of the system. We introduce a website structure inference mechanism called compact skeletons that is a step in the direction of automated wrapper generation. Compact skeletons provide a transformation from websites or other hierarchical data, such as XML documents, to relational tables. We study several classes of compact skeletons and provide polynomial-time algorithms and heuristics for automated construction of compact skeletons from websites. Experimental results show that our heuristics work well in practice. We also argue that compact skeletons are a natural extension of commercially deployed techniques for wrapper construction.
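
The idea of turning hierarchical data into a relational table can be illustrated with a small sketch. This is not the compact skeleton construction itself, which the abstract does not specify; it only shows the kind of output a wrapper is expected to produce. The sample XML, the element names (`store`, `category`, `item`, `title`, `price`), and the `flatten` helper are all assumptions made for illustration, and the mapping from tree paths to columns is written by hand here, whereas the paper's stated goal is to infer such structure automatically.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the element names are illustrative only.
SAMPLE = """
<store>
  <category name="laptops">
    <item><title>A100</title><price>899</price></item>
    <item><title>B200</title><price>1299</price></item>
  </category>
  <category name="phones">
    <item><title>P5</title><price>499</price></item>
  </category>
</store>
"""


def flatten(root):
    """Emit one (category, title, price) row per <item> element.

    The path-to-column mapping is hard-coded by hand, which is exactly
    the per-site manual effort that automated wrapper generation aims
    to avoid.
    """
    rows = []
    for category in root.findall("category"):
        for item in category.findall("item"):
            rows.append((
                category.get("name"),       # value inherited from the parent node
                item.findtext("title"),     # leaf value
                item.findtext("price"),     # leaf value, kept as text
            ))
    return rows


rows = flatten(ET.fromstring(SAMPLE))
for row in rows:
    print(row)
```

Each row combines a value taken from an ancestor node (the category name) with values from the leaves beneath it, which is the basic shape of a relational view over a tree.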
