Retrieving Deep Web Data Through Multi-Attributes Interfaces With Structured Queries

JIAN-WEI TIAN; WEN-HUI QI; XIAO-XIAO LIU

Abstract

A great deal of data on the Web lies in hidden databases, known as the deep Web. Most deep Web data is not directly available and can be accessed only through query interfaces. Current research on deep Web search has focused on crawling deep Web data through Web interfaces with keyword queries. However, these keyword-based methods have inherent limitations because of the multi-attribute and top-k features of the deep Web. In this paper we propose a novel approach for siphoning structured data with structured queries. First, to retrieve all the data in a hidden database without repetition, we model the hidden database as a hierarchy tree. Under this theoretical framework, data retrieval is transformed into a tree traversal problem. We also propose techniques that narrow the query space by using a heuristic rule, based on mutual information, to guide the traversal process. We conduct extensive experiments over real deep Web sites and controlled databases to illustrate the coverage and efficiency of our techniques.
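
The tree model described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the sample tuples, the attribute domains, the limit `K`, and the functions `top_k_query` and `crawl` are all hypothetical, and the sketch assumes the interface reports when a query matches more than `K` tuples. Each tree node is a conjunctive structured query; a node that overflows is split into one child per value of the next attribute.

```python
# Illustrative sketch only: the data, names, and overflow signal are
# assumptions, not details taken from the paper.

# Hypothetical hidden database; each tuple has a unique id plus three
# queryable attributes.
ROWS = [
    {"id": 1, "make": "ford", "color": "red", "year": 2009},
    {"id": 2, "make": "ford", "color": "red", "year": 2010},
    {"id": 3, "make": "ford", "color": "blue", "year": 2009},
    {"id": 4, "make": "ford", "color": "blue", "year": 2010},
    {"id": 5, "make": "audi", "color": "red", "year": 2009},
    {"id": 6, "make": "audi", "color": "blue", "year": 2010},
]

# Hypothetical attribute domains, e.g. the options of a form's drop-down lists.
DOMAINS = {
    "make": ["ford", "audi"],
    "color": ["red", "blue"],
    "year": [2009, 2010],
}

K = 2  # the simulated interface returns at most K tuples per query


def top_k_query(predicates):
    """Simulated top-k interface: return (at most K matches, overflow flag)."""
    matches = [r for r in ROWS if all(r[a] == v for a, v in predicates.items())]
    return matches[:K], len(matches) > K


def crawl(attributes, predicates, stats):
    """Depth-first traversal of the query tree rooted at `predicates`."""
    stats["queries"] += 1
    results, overflow = top_k_query(predicates)
    if not overflow:
        # The query returned everything it matches: this subtree is done.
        return results
    depth = len(predicates)
    if depth == len(attributes):
        # Every attribute is already fixed, so the query cannot be refined;
        # tuples beyond the first K stay unreachable in this sketch.
        return results
    # Split the overflowing node: one child query per value of the next
    # attribute. Children match disjoint sets of tuples.
    attr = attributes[depth]
    collected = []
    for value in DOMAINS[attr]:
        child = {**predicates, attr: value}
        collected.extend(crawl(attributes, child, stats))
    return collected


stats = {"queries": 0}
retrieved = crawl(["make", "color", "year"], {}, stats)
print(len(retrieved), "tuples retrieved with", stats["queries"], "queries")
```

Because sibling queries differ in the value of one attribute, no tuple is returned by two leaves, which is what makes the retrieval non-repeating. The attribute order here is fixed by hand; the abstract describes a mutual-information heuristic for guiding the traversal, which this sketch does not attempt to reproduce.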

Bibliographic details

Source
International Journal of Software Engineering and Knowledge Engineering, 2011, No. 4, pp. 523-542 (20 pages)
Authors
JIAN-WEI TIAN; WEN-HUI QI; XIAO-XIAO LIU
Author affiliation

Hunan Electric Power Corporation Research Institute, State Grid Corporation of China, Changsha 410007, China (listed for each of the three authors)
Original format: PDF
Text language: English
Keywords
Hidden databases; data retrieval; multi-attribute interfaces; top-k tuples
