Complexity of extracting database schema from semistructured documents

Nobutaka Suzuki; Yoichirou Sato; Michiyoshi Hayase

Abstract

Semistructured data has an irregular structure and no a priori database schema, which leads to problems such as inefficient data retrieval and wasteful data storage. Several heuristic algorithms for extracting a database schema have been proposed; however, the complexity of the schema extraction problem has hardly been discussed. In this paper, we consider an optimization problem: extract a database schema consisting of the fewest classes such that the density of each class is no less than a given threshold, where the density of a class represents the similarity between the type of the class and the types of the objects in the class. We first prove that the corresponding decision problem is strongly NP-hard and belongs to $\Sigma_2^P$. We then show that, for any r < 3/2, there is no polynomial-time r-approximation algorithm for the optimization problem unless P = NP.
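
To make the shape of the optimization problem concrete, the following is a minimal brute-force sketch in Python. It is not the authors' formulation: the abstract does not define density precisely, so the sketch assumes that each object is a set of attribute labels, that the type of a class is the union of its members' labels, and that density is the fraction of (object, attribute) slots that are actually filled. The names `density`, `partitions`, and `min_schema` are hypothetical. The search enumerates every partition of the objects, so it takes exponential time and is only usable on tiny inputs.

```python
from fractions import Fraction


def density(objects):
    """Assumed density of a class (not the paper's definition).

    The class type is taken to be the union of its members' attribute
    labels; density is the share of (object, attribute) slots that are
    filled. A class containing a single object has density 1.
    """
    class_type = frozenset().union(*objects)
    if not class_type:
        return Fraction(1)
    filled = sum(len(obj) for obj in objects)
    return Fraction(filled, len(objects) * len(class_type))


def partitions(items):
    """Yield every partition of `items` into non-empty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for smaller in partitions(rest):
        # Add `first` to each existing block in turn ...
        for i in range(len(smaller)):
            yield smaller[:i] + [[first] + smaller[i]] + smaller[i + 1:]
        # ... or give it a block of its own.
        yield [[first]] + smaller


def min_schema(objects, threshold):
    """Return a partition with the fewest classes whose densities all
    reach `threshold`. Exhaustive search; exponential in len(objects)."""
    best = None
    for candidate in partitions(list(objects)):
        if all(density(block) >= threshold for block in candidate):
            if best is None or len(candidate) < len(best):
                best = candidate
    return best


if __name__ == "__main__":
    # Hypothetical sample objects: two "person-like", two "book-like".
    objs = [
        frozenset({"name", "email"}),
        frozenset({"name", "email", "phone"}),
        frozenset({"title", "year"}),
        frozenset({"title", "year", "isbn"}),
    ]
    schema = min_schema(objs, Fraction(3, 4))
    print(len(schema))  # prints 2
```

Under these assumptions, raising the threshold forces more classes: with a threshold of 9/10 the same four objects can no longer be paired, and each ends up in its own class.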

Bibliographic details

Source
電子情報通信学会技術研究報告. コンピュテーション. Theoretical Foundations of Computing, 2000, No. 705, 8 pages
Authors
Nobutaka Suzuki; Yoichirou Sato; Michiyoshi Hayase
Original format: PDF
Text language: Japanese (jpn)
Chinese Library Classification: Computing technology, computer technology
Keywords
Semistructured data; Schema extraction problem; Strong NP-hardness; Approximation algorithm
