An XML-based Wrapper Generator for Web Information Extraction

Abstract

There has been tremendous interest in information integration systems that automatically gather, manipulate, and integrate data from multiple information sources on a user's behalf. Unfortunately, web sites are primarily designed for human browsing rather than for use by a computer program. Mechanically extracting their content is in general a rather difficult job if not impossible [4]. Software systems using such web information sources typically use hand-coded wrappers to extract information content of interest from web sources and translate query responses to a more structured format (e.g., relational form) before unifying them into an integrated answer to a user's query. The most recent generation of information mediator systems (e.g., Ariadne [3], CQ [5, 7], Internet Softbots [4], TSIMMIS [2]) addresses this problem by enabling a pre-wrapped set of web sources to be accessed via database-like queries.

Bibliographic details

Source: ACM SIGMOD International Conference on Management of Data, 1999, 4 pages
Authors: Ling Liu; Wei Han; David Buttler; Calton Pu; Wei Tang
Original format: PDF
Chinese Library Classification: TP3-532

Similar literature

1. Entropy-based automated wrapper generation for weblog data extraction [J]. George Gkotsis, Karen Stepanyan, Alexandra I. Cristea. World Wide Web. 2014, No. 4
2. L-wrappers: concepts, properties and construction - A declarative approach to data extraction from web sources [J]. Badica C, Badica A, Popescu E. Soft computing: A fusion of foundations, methodologies and applications. 2007, No. 8
3. Why happy-wrappers festoon their webs [J]. Nora Schultz. New Scientist. 2008, No. 2650
4. An XML-based Wrapper Generator for Web Information Extraction [C]. Ling Liu, Wei Han, David Buttler. ACM SIGMOD International Conference on Management of Data. 1999
5. Automatically constructing wrappers for effective and efficient Web information extraction [D]. Mundluru, Dheerendranath. 2008
6. A full XML-based approach to creating hypermedia learning modules in web-based environments: application to a pathology course [O]. Pascal Staccini, Jean-Charles Dufour, Michel Joubert. 2003
7. A Formal Comparison of Visual Web Wrapper Generators [O]. Georg Gottlob, Christoph Koch. 2006
