On Extracting a Database Schema from Semistructured Documents


Abstract

In semistructured data, the data structure is irregular and no explicit database schema is given, which causes problems such as inefficient data retrieval and wasteful data storage. To cope with these problems, several algorithms for extracting a database schema from semistructured data have been proposed, in which data is modeled as an unordered tree. However, the order of elements is indispensable for document data. We therefore model data as an ordered tree and consider the problem of extracting an optimum database schema from semistructured data. We first show that the corresponding decision problem is strongly NP-complete. We then propose a polynomial-time algorithm for extracting a database schema.
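The distinction between ordered and unordered tree models that the abstract relies on can be illustrated with a minimal sketch. This is not the authors' algorithm. The `Node` class, the two key functions, and the sample documents are hypothetical names introduced here only for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """A labeled tree node; `children` keeps document order."""
    label: str
    children: List["Node"] = field(default_factory=list)


def ordered_key(node: Node) -> tuple:
    """Structural key that preserves the order of child elements."""
    return (node.label, tuple(ordered_key(c) for c in node.children))


def unordered_key(node: Node) -> tuple:
    """Structural key that ignores child order by sorting child keys."""
    return (node.label, tuple(sorted(unordered_key(c) for c in node.children)))


# Two hypothetical documents with the same elements in a different order.
doc_a = Node("article", [Node("title"), Node("author"), Node("section")])
doc_b = Node("article", [Node("author"), Node("title"), Node("section")])

# Under the unordered model the two documents have the same structure;
# under the ordered model they are distinguished.
same_unordered = unordered_key(doc_a) == unordered_key(doc_b)
same_ordered = ordered_key(doc_a) == ordered_key(doc_b)
```

In this sketch, an unordered model would treat both documents as structurally identical, while an ordered model keeps them apart, which is the property the abstract argues document data requires.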


Bibliographic details

Source
World Multiconference on Systemics, Cybernetics and Informatics (SCI 2001), v.14: Computer Science and Engineering, pt.2; July 22-25, 2001; Orlando, FL, US | 2001 | pp. 220-225 | 6 pages
Conference location: Orlando, FL (US)
Authors
Nobutaka SUZUKI; Yoichirou SATO; Michiyoshi HAYASE;
Author affiliation

Department of Information and System Engineering Okayama Prefectural University Soja, Okayama 719-1197, Japan;

Original format: PDF
Text language: English
Chinese Library Classification: Computing technology, computer technology
Keywords
semistructured data; schema extraction problem; strong NP-completeness; document database;


