Proceedings of 2010 International Conference on Methods and Models in Computer Science

Applications of clustering algorithms and self organizing maps as data mining and business intelligence tools on real world data sets



Abstract

Partitioning a large set of objects into homogeneous clusters is a fundamental operation in data mining. The k-means algorithm is best suited for implementing this operation because of its efficiency in clustering large data sets. In this paper we present a comparative study of several nonhierarchical and hierarchical clustering algorithms, including SOM (Self-Organizing Map) neural network methods, measured against k-means clustering on large data sets. Data were simulated with correlated and uncorrelated variables and with non-overlapping and overlapping clusters, both with and without outliers. Tested on the Telecommunication Users and Iris Flower data sets, the compared algorithms demonstrated very good classification performance. Experiments on a very large telecommunication data set consisting of 1000 records and 32 categorical attributes, and on the Iris Flower data set consisting of 150 samples, show that SOM clustering, compared with k-means and hierarchical clustering, is scalable in terms of both the number of clusters and the number of records.
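As an illustration of the k-means baseline named in the abstract, the following is a minimal sketch of Lloyd's iteration in plain Python. It is not the authors' implementation: the function names, the farthest-first initialization, and the eight toy points are all assumptions made for this example, and none of the paper's data sets are reproduced.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def farthest_first_init(points, k):
    """Pick k starting centroids (an assumed, deterministic choice):
    begin with the first point, then repeatedly add the point that is
    farthest from every centroid chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        next_centroid = max(
            points,
            key=lambda p: min(squared_distance(p, c) for c in centroids),
        )
        centroids.append(next_centroid)
    return centroids


def k_means(points, k, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update
    until the assignments stop changing or max_iter is reached."""
    centroids = farthest_first_init(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = mean_point(members)
    return centroids, labels


# Hypothetical toy data: two well-separated groups of four points each.
points = [
    (0.0, 0.0), (0.5, 0.2), (0.2, 0.6), (0.4, 0.4),
    (10.0, 10.0), (10.3, 9.8), (9.7, 10.4), (10.1, 10.2),
]
centroids, labels = k_means(points, k=2)
print(labels)
print(centroids)
```

Each iteration costs time proportional to the number of records times the number of clusters, which is the usual reason k-means is described as efficient on large data sets. The squared Euclidean distance used here assumes numeric attributes, so categorical attributes such as those in the telecommunication data set would need an encoding or a different dissimilarity measure.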
