Journal: Data & Knowledge Engineering

Automatically maintaining wrappers for semi-structured web sources



Abstract

For software programs to benefit fully from semi-structured web sources, wrapper programs must be built to provide a "machine-readable" view over them. A wrapper accepts a query against the source and returns a set of structured results, enabling applications to access web data much as they would access information in a database. A significant problem with this approach is that web sources may undergo changes that invalidate the current wrappers. In this paper, we present novel heuristics and algorithms to address this problem. In our approach, the system collects some query results during normal wrapper operation and, when the source changes, uses them as input to generate a set of labeled examples for the source, which can then be used to induce a new wrapper.
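The labeling step described in the abstract can be illustrated with a minimal sketch. The abstract does not spell out the paper's heuristics, so the code below assumes the simplest possible strategy: exact string matching of previously stored field values against the changed page. The names `LabeledExample`, `label_page`, and the sample records are hypothetical and serve only as illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LabeledExample:
    field: str  # name of the field in the stored record
    value: str  # field value located on the changed page
    start: int  # character offset of its first occurrence in the page


def label_page(page: str, stored_records: list[dict[str, str]]) -> list[LabeledExample]:
    """Locate values of previously extracted records in the changed page.

    Assumption: a record counts as "found" only if every one of its field
    values occurs verbatim in the page. The actual paper may use more
    tolerant heuristics.
    """
    examples: list[LabeledExample] = []
    for record in stored_records:
        hits = []
        for field, value in record.items():
            pos = page.find(value)
            if pos == -1:
                break  # one field is missing, so skip the whole record
            hits.append(LabeledExample(field, value, pos))
        else:
            # Reached only when no field was missing.
            examples.extend(hits)
    return examples


# Results collected while the old wrapper still worked (made-up data).
stored = [
    {"title": "Dune", "author": "Frank Herbert"},
    {"title": "Ulysses", "author": "James Joyce"},
]

# The source after a layout change that broke the old wrapper (made-up markup).
new_page = (
    "<div><b>Dune</b> by <i>Frank Herbert</i></div>"
    "<div><b>Emma</b> by <i>Jane Austen</i></div>"
)

examples = label_page(new_page, stored)
```

Here only the "Dune" record is still present on the changed page, so it alone yields labeled examples. In a full system, such examples would be passed to a wrapper induction algorithm, which is outside the scope of this sketch.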
