Many applications benefit from finding valuable facts in imprecise data, the diamonds in the dirt, without having to clean the data first. The goal of probabilistic databases is to make uncertainty a first-class citizen, and to reduce the cost of using such data, or (more likely) to enable applications that were otherwise prohibitively expensive. This article describes some of the recent advances in large-scale query processing on probabilistic databases and their roots in classical data management concepts.