Conference: International Multiconference on Computer Science and Information Technology
Application of clustering and association methods in data cleaning

Abstract

Data cleaning is the process of maintaining data quality in information systems. Current data cleaning solutions require reference data to identify incorrect or duplicate entries. This article proposes applying data mining to data cleaning as an effective way to discover reference data and validation rules from the data itself. Two algorithms designed by the author for data attribute correction are presented, both based on data mining methods. Experimental results show that both algorithms can effectively clean text attributes without external reference data.
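
The abstract does not describe the two algorithms in detail, so the following is only a minimal sketch of the general idea of correcting a text attribute without external reference data. It is not the author's method. It assumes that frequent values in a column are probably correct and that rare values that closely resemble a frequent one are probably misspellings of it. The function names and both thresholds (`min_similarity`, `min_support_ratio`) are hypothetical, and string similarity is approximated with Python's standard `difflib.SequenceMatcher`.

```python
from collections import Counter
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity score between two strings, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def correct_attribute(values, min_similarity=0.8, min_support_ratio=2.0):
    """Map rare spellings onto a frequent, similar value found in the data.

    Hypothetical parameters (illustrative only, not taken from the paper):
      min_similarity    - how close two strings must be to count as variants
      min_support_ratio - how many times more frequent a canonical value
                          must be than the variant it replaces
    """
    counts = Counter(values)
    # Visit values from most to least frequent, so that frequent values
    # become the "discovered" reference data before rare ones are checked.
    ranked = [value for value, _ in counts.most_common()]

    canonical = []  # values accepted as reference entries
    mapping = {}    # original value -> corrected value
    for value in ranked:
        target = value
        for ref in canonical:
            frequent_enough = counts[ref] >= min_support_ratio * counts[value]
            if frequent_enough and similarity(value, ref) >= min_similarity:
                target = ref
                break
        if target == value:
            canonical.append(value)
        mapping[value] = target

    return [mapping[v] for v in values]


if __name__ == "__main__":
    cities = ["Warszawa"] * 5 + ["Warszwa", "Warzawa"] + ["Krakow"] * 4 + ["Krakwo"]
    print(Counter(correct_attribute(cities)))
```

In this toy input the misspelled city names are rare and differ from a frequent value by only a character or two, so they should be folded into "Warszawa" and "Krakow" respectively. Real data would need thresholds tuned to the attribute, and values that are rare but legitimately distinct can be merged incorrectly under this assumption.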