A Shopping Agent That Automatically Constructs Wrappers for Semi-Structured Online Vendors

Abstract

This paper proposes a shopping agent with a robust inductive learning method that automatically constructs wrappers for semi-structured online stores. The strong biases assumed in many existing systems are weakened so that real stores with reasonably complex document structures can be handled. Our method treats a logical line as the basic unit, and recognizes the position and structure of product descriptions by finding the most frequent pattern in the sequence of logical-line information in the output HTML pages. The method can analyze product descriptions that span multiple logical lines, even those with extra or missing attributes. Experimental tests on over 60 sites show that it constructs correct wrappers for most real stores.
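The core idea of the abstract, labeling each logical line and then looking for the most frequent label pattern, can be illustrated with a small sketch. This is not the paper's algorithm, which the abstract does not specify in detail. The category labels, the regular expressions, the coverage-based scoring, and the names `categorize`, `most_frequent_pattern`, and `extract_records` are all assumptions made for illustration. The sketch requires exact pattern matches, so it does not handle the extra or missing attributes that the paper's method is said to tolerate.

```python
import re


def categorize(line):
    """Map one logical line to a coarse label (hypothetical label set)."""
    if re.search(r"[$€£]\s?\d", line):
        return "PRICE"
    if re.search(r"<a\s", line, re.IGNORECASE):
        return "LINK"
    if re.search(r"<img\s", line, re.IGNORECASE):
        return "IMAGE"
    return "TEXT" if line.strip() else "BLANK"


def match_starts(labels, pattern):
    """Return start indices of non-overlapping, left-to-right exact matches."""
    n, i, starts = len(pattern), 0, []
    while i + n <= len(labels):
        if tuple(labels[i:i + n]) == pattern:
            starts.append(i)
            i += n
        else:
            i += 1
    return starts


def most_frequent_pattern(labels, min_len=2, max_len=4):
    """Pick the repeated label pattern that covers the most lines.

    Assumed scoring: (non-overlapping hits * pattern length), with ties
    broken in favour of the shorter pattern. Patterns seen once are ignored.
    """
    candidates = {
        tuple(labels[i:i + n])
        for n in range(min_len, max_len + 1)
        for i in range(len(labels) - n + 1)
    }
    best, best_key = None, None
    for pattern in sorted(candidates):  # sorted for deterministic results
        hits = len(match_starts(labels, pattern))
        if hits < 2:
            continue
        key = (hits * len(pattern), -len(pattern))
        if best_key is None or key > best_key:
            best, best_key = pattern, key
    return best


def extract_records(lines, labels, pattern):
    """Group the logical lines of each pattern occurrence into one record."""
    n = len(pattern)
    return [lines[s:s + n] for s in match_starts(labels, pattern)]


# Made-up result page: one header line, three products, one footer line.
page_lines = [
    "<h1>Search results</h1>",
    '<a href="/p/1">Blue Widget</a>', "Sturdy widget, 10 cm", "$12.50",
    '<a href="/p/2">Red Widget</a>', "Compact widget, 8 cm", "$9.99",
    '<a href="/p/3">Green Widget</a>', "Large widget, 15 cm", "$15.00",
    "Copyright notice",
]
page_labels = [categorize(line) for line in page_lines]
wrapper_pattern = most_frequent_pattern(page_labels)
records = extract_records(page_lines, page_labels, wrapper_pattern)
print(wrapper_pattern)
print(len(records))
```

In this toy page the header and footer lines fall outside every occurrence of the chosen pattern, which is how a frequency-based approach can locate the product list without site-specific rules. A learner usable on real stores would need a richer label set and some tolerance for variation between product descriptions.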
