Open-world probabilistic databases: Semantics, algorithms, complexity

Ismail Ilkan Ceylan; Adnan Darwiche; Guy Van den Broeck

Abstract

Large-scale probabilistic knowledge bases are becoming increasingly important in academia and industry. They are continuously extended with new data, powered by modern information extraction tools that associate probabilities with knowledge base facts. The state of the art for storing and processing such data is founded on probabilistic databases. Many systems based on probabilistic databases, however, still have certain semantic deficiencies, which limit their potential applications. We revisit the semantics of probabilistic databases, and argue that the closed-world assumption of probabilistic databases, i.e., the assumption that facts not appearing in the database have probability zero, conflicts with the everyday use of large-scale probabilistic knowledge bases. To address this discrepancy, we propose open-world probabilistic databases as a new probabilistic data model. In this new data model, unknown facts, also called open facts, can be assigned any probability value from a default probability interval. Our analysis entails that our model aligns better with many real-world tasks such as query answering, relational learning, knowledge base completion, and rule mining. We make various technical contributions. We show that the data complexity dichotomy, between polynomial time and #P, for evaluating unions of conjunctive queries on probabilistic databases can be lifted to our open-world model. This result is supported by an algorithm that computes the probabilities of the so-called safe queries efficiently. Based on this algorithm, we prove that safe queries can be evaluated in linear time on probabilistic databases, under reasonable assumptions. This remains true in open-world probabilistic databases for a more restricted class of safe queries. We extend our data complexity analysis beyond unions of conjunctive queries, and obtain a host of complexity results for both classical and open-world probabilistic databases. We conclude our analysis with an in-depth investigation of the combined complexity in the respective models.
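
The open-world idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm. It assumes a tuple-independent database over a single unary relation R, a finite domain of constants, and a default interval of the form [0, lam] for open facts; the names prob_exists, open_world_bounds, and lam are hypothetical. For the monotone query "exists x. R(x)", the lowest probability is reached when every open fact gets probability 0 (the closed-world reading) and the highest when every open fact gets lam.

```python
from math import prod


def prob_exists(probs):
    """P(exists x. R(x)) when the facts R(c) are independent with these marginals."""
    return 1.0 - prod(1.0 - p for p in probs)


def open_world_bounds(known, domain, lam):
    """Return (lower, upper) bounds on P(exists x. R(x)).

    known  : dict mapping a constant c to the stored probability of R(c)
    domain : all constants of the (assumed finite) domain
    lam    : assumed upper end of the default interval [0, lam] for open facts
    """
    constants = list(domain)
    # Lower bound: facts missing from the database get probability 0.
    lower = prob_exists(known.get(c, 0.0) for c in constants)
    # Upper bound: facts missing from the database get probability lam.
    upper = prob_exists(known.get(c, lam) for c in constants)
    return lower, upper


if __name__ == "__main__":
    lo, hi = open_world_bounds({"a": 0.5}, ["a", "b", "c"], lam=0.1)
    print(lo, hi)
```

With one stored fact R(a) of probability 0.5 and two open facts bounded by 0.1, the query probability is no longer a single number but an interval whose lower end is the closed-world answer.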

Bibliographic details

Source
Artificial intelligence | 2021, Issue 6 | 103474.1-103474.34 | 34 pages
Authors
Ismail Ilkan Ceylan; Adnan Darwiche; Guy Van den Broeck
Author affiliations

Department of Computer Science, University of Oxford, Wolfson Building, Parks Road, Oxford OX1 3QD, UK

Computer Science Department, University of California, Los Angeles, 404 Westwood Plaza, Los Angeles, CA 90095, United States of America

Computer Science Department, University of California, Los Angeles, 404 Westwood Plaza, Los Angeles, CA 90095, United States of America

Original format: PDF
Language: English
Keywords
Knowledge bases; Probabilistic databases; Semantics; Closed-world assumption; Open-world assumption; Inference; Credal sets; Learning; Data complexity; Dichotomy; Lifted inference
