Mapping and cleaning


Abstract

We address the challenging and open problem of bringing together two crucial activities in data integration and data quality, i.e., transforming data using schema mappings, and fixing conflicts and inconsistencies using data repairing. This problem is made complex by several factors. First, schema mappings and data repairing have traditionally been considered as separate activities, and research has progressed in a largely independent way in the two fields. Second, the elegant formalizations and the algorithms that have been proposed for both tasks have had mixed success in scaling to large databases. In the paper, we introduce a very general notion of a mapping and cleaning scenario that incorporates a wide variety of features, such as user interventions. We develop a new semantics for these scenarios that represents a conservative extension of previous semantics for schema mappings and data repairing. Based on the semantics, we introduce a chase-based algorithm to compute solutions. Appropriate care is devoted to developing a scalable implementation of the chase algorithm. To the best of our knowledge, this is the first general and scalable proposal in this direction.
