Journal: Computational Intelligence and Neuroscience

Localized Ambient Solidity Separation Algorithm Based Computer User Segmentation

Abstract

Most popular clustering methods make strong assumptions about the dataset. For example, k-means implicitly assumes that all clusters come from spherical Gaussian distributions with different means but the same covariance. However, for datasets with diverse distribution shapes or high dimensionality, these assumptions may no longer hold. To overcome this weakness, we propose a new clustering algorithm, the localized ambient solidity separation (LASS) algorithm, which uses a new isolation criterion called centroid distance. Compared with other density-based isolation criteria, the centroid distance criterion addresses the problems caused by high dimensionality and varying density. An experiment on a purpose-designed two-dimensional benchmark dataset shows that LASS not only inherits the ability of the original dissimilarity increments clustering method to separate naturally isolated clusters but can also identify clusters that are adjacent, overlapping, and under background noise. Finally, we compare LASS with the dissimilarity increments clustering method on a massive computer user dataset of over two million records containing demographic and behavioral information. The results show that LASS works extremely well on this computer user dataset and can extract more knowledge from it.
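
The k-means assumption mentioned in the abstract can be illustrated with a small sketch. It is not the LASS algorithm and uses no data from the paper: the two elongated rows of points, and the helper names `centroid` and `wcss`, are hypothetical choices made for illustration. The sketch compares the k-means objective (within-cluster sum of squared distances) for the natural two-row partition against a partition that cuts both rows in half. On non-spherical clusters like these, the objective prefers the wrong cut.

```python
# Toy illustration (assumed data, not from the paper): the k-means objective
# favors compact, roughly spherical groups over elongated natural clusters.

def centroid(points):
    """Mean of a list of 2-D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def wcss(clusters):
    """Within-cluster sum of squared Euclidean distances (k-means objective)."""
    total = 0.0
    for pts in clusters:
        cx, cy = centroid(pts)
        total += sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in pts)
    return total

# Two long, thin clusters: horizontal rows at y = 0 and y = 2.
row_a = [(float(x), 0.0) for x in range(-10, 11)]
row_b = [(float(x), 2.0) for x in range(-10, 11)]
all_points = row_a + row_b

# Natural partition: one cluster per row.
true_partition = [row_a, row_b]

# Alternative partition: cut both rows at x = 0, mixing the two rows.
split_by_x = [
    [p for p in all_points if p[0] < 0],
    [p for p in all_points if p[0] >= 0],
]

true_cost = wcss(true_partition)
split_cost = wcss(split_by_x)

print("natural rows :", true_cost)
print("cut at x = 0 :", split_cost)
print("k-means objective prefers the cut:", split_cost < true_cost)
```

Because the cut at x = 0 has the lower cost, a method that minimizes this objective tends to favor it over the two natural rows, which is the kind of failure on non-spherical clusters that motivates shape-agnostic isolation criteria such as the one the abstract proposes.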