
Open-world probabilistic databases: Semantics, algorithms, complexity



Abstract

Large-scale probabilistic knowledge bases are becoming increasingly important in academia and industry. They are continuously extended with new data, powered by modern information extraction tools that associate probabilities with knowledge base facts. State-of-the-art approaches to storing and processing such data are founded on probabilistic databases. Many systems based on probabilistic databases, however, still have certain semantic deficiencies, which limit their potential applications. We revisit the semantics of probabilistic databases and argue that their closed-world assumption, i.e., the assumption that facts not appearing in the database have probability zero, conflicts with the everyday use of large-scale probabilistic knowledge bases. To address this discrepancy, we propose open-world probabilistic databases as a new probabilistic data model. In this new data model, unknown facts, also called open facts, can be assigned any probability from a default probability interval. Our analysis indicates that this model aligns better with many real-world tasks such as query answering, relational learning, knowledge base completion, and rule mining. We make various technical contributions. We show that the data complexity dichotomy, between polynomial time and #P, for evaluating unions of conjunctive queries on probabilistic databases can be lifted to our open-world model. This result is supported by an algorithm that efficiently computes the probabilities of the so-called safe queries. Based on this algorithm, we prove that safe queries can be evaluated in linear time on probabilistic databases, under reasonable assumptions. This remains true in open-world probabilistic databases for a more restricted class of safe queries. We extend our data complexity analysis beyond unions of conjunctive queries, and obtain a host of complexity results for both classical and open-world probabilistic databases. We conclude our analysis with an in-depth investigation of the combined complexity in the respective models.
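
To make the difference between the closed-world and open-world readings concrete, here is a minimal Python sketch. It is not the algorithm from the paper. It assumes a tuple-independent database, a single hypothetical unary relation R, the existential query "there is some x with R(x)", and a default probability interval of the form [0, lam]; the function names and the threshold name `lam` are illustrative assumptions. Because this query is monotone, its probability is lowest when every open fact gets probability 0 (the closed-world value) and highest when every open fact gets probability `lam`.

```python
from math import prod


def exists_prob(probs):
    """P(exists x. R(x)) for independent facts: 1 - prod(1 - p)."""
    return 1.0 - prod(1.0 - p for p in probs)


def open_world_interval(known, domain, lam):
    """Lower and upper probability of 'exists x. R(x)'.

    known  -- dict mapping a constant to the probability of R(constant)
    domain -- all constants under consideration (assumed finite)
    lam    -- assumed upper end of the default interval [0, lam]
    """
    # Facts over the domain that the database does not mention are open.
    open_count = sum(1 for c in domain if c not in known)
    # Lower bound: every open fact has probability 0 (closed-world value).
    lower = exists_prob(known.values())
    # Upper bound: every open fact has probability lam.
    upper = exists_prob(list(known.values()) + [lam] * open_count)
    return lower, upper


if __name__ == "__main__":
    print(open_world_interval({"a": 0.5}, ["a", "b", "c"], 0.1))
```

With one known fact of probability 0.5 and two open facts bounded by 0.1, the query probability ranges from 0.5 to roughly 0.595, whereas the closed-world reading reports only the single value 0.5.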
