Conference: International Conference on NetObjectDays

Toolkits for Generating Wrappers: A Survey of Software Toolkits for Automated Data Extraction from Web Sites

Abstract

Various web applications in e-business, such as online price comparisons, competition monitoring and personalised newsletters, require retrieval of distributed information from the Internet. This paper examines the suitability of software toolkits for the extraction of data from web sites. The term wrapper is defined and an overview of presently available toolkits for generating wrappers is provided. In order to give a better insight into the workings of such toolkits, a detailed analysis of the non-commercial software program LAPIS is presented. An example application using this toolkit demonstrates how acceptable results can be achieved with relative ease. The functionality of the program is compared with the functionality of the commercial toolkit RoboMaker and the differences are highlighted. With the aim of improving ease of use and speeding up wrapper generation, possible areas for further development of toolkits for automated web data extraction are discussed.