Web Data Extraction Based on Visual Information and Partial Tree Alignment

Abstract

Web databases contain a huge amount of structured data that can be easily obtained, though only through their query interfaces. The query results are presented in dynamically generated web pages, usually in the form of data records, for human use. Automatic web data extraction is critical to web integration, and a number of approaches have been proposed. Early work is mostly based on the source code or the tag tree of the page. Recent approaches use visual features to extract data and perform better than the earlier work, but they still have inherent limitations. In this paper, we propose a novel approach that makes use of visual features to extract data from web pages, including both data records and data items. Experiments on a large set of query result pages from different domains show that the proposed approach is highly effective.

Bibliographic details

Source: Web Information System and Application Conference, 2014, pp. 18-23 (6 pages)
Authors: Siwu Fan; Xinjun Wang; Yongquan Dong
Original format: PDF
Keywords: Web data extraction; Web mining; Wrapper generation
