Entropy-Based Mixed Data Transform Model

Abstract

In the era of big data, in-depth data mining is both inevitable and urgent. Cluster analysis is one of the major data mining methods, and measuring the similarity or distance between two objects is a key step in many data mining and knowledge discovery tasks. For abundant unstructured, free-form data, conversion between numeric and categorical data matters most: the notion of similarity is relatively well studied for numeric data, but remains unsatisfactory for categorical data. Drawing on current clustering algorithms for categorical and mixed data, several methods and their corresponding features are explored and summarized. Results on a variety of data sets show that, although no single measure dominates the others across all types of problems, some measures can be integrated into the clustering process. The proposed method has the potential to handle data sets with both numeric and categorical (mixed) features.
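
The abstract does not spell out the transform itself, so the following is only a generic illustration of the two ingredients it names, not the paper's method. It sketches the Shannon entropy of a categorical feature and a simple mixed-feature distance that adds squared Euclidean distance on numeric features to a weighted count of categorical mismatches. The function names, the `gamma` weight, and the index arguments are all assumptions made for this example.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy, in bits, of the values in one categorical column."""
    counts = Counter(values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def mixed_distance(a, b, numeric_idx, categorical_idx, gamma=1.0):
    """Illustrative distance between two mixed-type records.

    Numeric features contribute squared differences; categorical features
    contribute gamma for every position where the two records disagree.
    gamma is a hypothetical weight balancing the two parts.
    """
    numeric_part = sum((a[i] - b[i]) ** 2 for i in numeric_idx)
    categorical_part = sum(a[i] != b[i] for i in categorical_idx)
    return numeric_part + gamma * categorical_part


# Example: one numeric feature (index 0) and one categorical feature (index 1).
colours = ["red", "blue", "red", "blue"]
print(shannon_entropy(colours))
print(mixed_distance((1.0, "red"), (3.0, "blue"), [0], [1]))
```

An entropy value like the one computed here could, for instance, be used to weight categorical features by how informative they are, but how the paper actually uses entropy is not stated in the abstract.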
