
Complexity of extracting database schema from semistructured documents



Abstract

Semistructured data has irregular structure and no a priori database schema, so we encounter several problems such as inefficient data retrieval and wasteful data storage. Some heuristic algorithms for extracting a database schema have been proposed; however, the complexity of the schema extraction problem has hardly been discussed. In this paper, we consider an optimization problem: extract a database schema with the fewest classes such that the density of each class is no less than a given threshold, where the density of a class represents the similarity between the type of the class and the types of the objects in the class. We first prove that the corresponding decision problem is strongly NP-hard and belongs to $\Sigma_2^P$. Then we show that for any r < 3/2, there is no polynomial-time r-approximation algorithm for the optimization problem unless P = NP.
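
As a reading aid, the optimization problem described in the abstract can be written schematically as below. The symbols O, C_i, d and θ, and the assumption that the classes together cover all objects, are illustrative notation rather than the paper's own; the abstract does not give the precise definition of density, so d is left abstract here.

```latex
% Illustrative notation (assumed, not taken from the paper):
%   O       : set of objects in the semistructured document
%   C_i     : classes of the extracted schema
%   d(C_i)  : density of class C_i (defined in the paper; not reproduced here)
%   \theta  : the given density threshold
\begin{align*}
  \text{minimize}   \quad & k \\
  \text{subject to} \quad & C_1 \cup \dots \cup C_k = O, \\
                          & d(C_i) \ge \theta \quad \text{for } i = 1, \dots, k.
\end{align*}
% A polynomial-time r-approximation algorithm A for this minimization problem
% would guarantee A(I) \le r \cdot \mathrm{OPT}(I) on every instance I.
% The abstract states that no such algorithm exists for r < 3/2 unless P = NP.
```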
