Conference: International Conference on Advanced Engineering Computing and Application in Sciences

An Application of Data Mining to Identify Data Quality Problems

Abstract

Modern information systems consist of many distributed computer and database systems. Integrating such distributed data into a single data warehouse runs into the well-known problem of low data quality. In this paper we present an approach that facilitates the dynamic identification of spurious and error-prone data stored in a large data warehouse. The identification of data quality problems is based on data mining techniques such as clustering, subspace clustering, and classification. Furthermore, we demonstrate the applicability of our approach to real data through a case study. The experimental results show that our approach efficiently identifies data quality problems.
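
The abstract does not describe the paper's actual algorithm, so the following is only an illustrative sketch of how clustering could, in principle, surface suspicious records: run a plain k-means and flag records that land in very small clusters. The record layout (age, years of service), the function names, the deterministic farthest-first initialization, and the `min_size` threshold are all assumptions made for this example and are not taken from the paper.

```python
import math
from collections import Counter


def farthest_first_init(points, k):
    """Pick k starting centroids deterministically: the first point, then
    repeatedly the point farthest from all centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        centroids.append(nxt)
    return centroids


def kmeans(points, k, max_iters=50):
    """Plain k-means; returns the cluster label of each point
    (labels come from the last assignment step performed)."""
    centroids = farthest_first_init(points, k)
    labels = []
    for _ in range(max_iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda i: math.dist(p, centroids[i]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:
                updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                updated.append(centroids[i])  # keep an empty cluster's centroid
        if updated == centroids:
            break  # converged
        centroids = updated
    return labels


def flag_small_clusters(points, k, min_size=2):
    """Flag records whose cluster has fewer than min_size members
    (a hypothetical heuristic for 'suspicious' data)."""
    labels = kmeans(points, k)
    sizes = Counter(labels)
    return [p for p, lab in zip(points, labels) if sizes[lab] < min_size]


# Hypothetical records: (age, years_of_service).
records = [
    (25, 2), (27, 3), (26, 2), (24, 1),
    (52, 28), (55, 30), (53, 29), (54, 27),
    (26, 95),  # implausible: 95 years of service at age 26
]

suspicious = flag_small_clusters(records, k=3)
print(suspicious)  # → [(26, 95)]
```

Flagging by cluster size is only one possible heuristic; in practice the features would need scaling, and the choice of `k` and `min_size` would have to be tuned to the data at hand.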