Journal of Information Recording

HIC: A Robust and Efficient Hyper-Image-Based Clustering for Very Large Datasets


Abstract

Most existing clustering approaches require several scans of a dataset and incur a very high computational cost. In this paper, we propose a novel, efficient, and effective clustering framework that requires only one scan of the input dataset. First, the original dataset is transformed and merged into a hyper-image. Next, the dissimilarities between data points are measured, once and for all, using various image-processing methodologies. Image segmentation techniques are then applied to extract clusters from the hyper-image. The resulting clusters can be further processed to achieve fuzzy and/or hierarchical clustering effects. Moreover, the proposed framework can cluster incrementally, and even dynamically, with only one scan of the updated records, which also makes it suitable for clustering streaming data. Experimental results show that our approach is robust and stable under various parameter settings and data distributions, and it is more powerful and sophisticated than other methodologies.
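
The abstract does not say how the hyper-image is built or which segmentation techniques are used, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes 2-D points, uses a grid of per-cell point counts as a stand-in for the hyper-image, and uses 8-connected component labeling as a stand-in for image segmentation. The names `rasterize`, `segment`, `assign`, `cell_size`, and `min_count` are hypothetical and do not come from the paper.

```python
from collections import Counter, deque


def rasterize(points, cell_size, counts=None):
    """Single pass over the points: count how many fall into each grid cell.

    Passing an existing `counts` folds new records into the same grid,
    which is the assumed analogue of incremental clustering here.
    """
    counts = Counter() if counts is None else counts
    for x, y in points:
        counts[(int(x // cell_size), int(y // cell_size))] += 1
    return counts


def segment(counts, min_count=1):
    """Label 8-connected regions of cells holding at least `min_count` points."""
    dense = {cell for cell, n in counts.items() if n >= min_count}
    labels = {}
    next_label = 0
    for start in sorted(dense):
        if start in labels:
            continue
        # Breadth-first flood fill from an unlabeled dense cell.
        labels[start] = next_label
        queue = deque([start])
        while queue:
            cx, cy = queue.popleft()
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    neighbor = (cx + dx, cy + dy)
                    if neighbor in dense and neighbor not in labels:
                        labels[neighbor] = next_label
                        queue.append(neighbor)
        next_label += 1
    return labels


def assign(points, labels, cell_size):
    """Map each point to its cell's cluster label, or -1 if the cell is sparse."""
    return [
        labels.get((int(x // cell_size), int(y // cell_size)), -1)
        for x, y in points
    ]


points = [(0.1, 0.2), (0.4, 0.3), (1.2, 0.8), (5.0, 5.1), (5.3, 5.6), (9.9, 0.1)]
counts = rasterize(points, cell_size=1.0)
print(assign(points, segment(counts, min_count=2), cell_size=1.0))
# -> [0, 0, -1, 1, 1, -1]

# Incremental update: only the new record is scanned, then the grid is re-segmented.
new_points = [(1.5, 0.5)]
rasterize(new_points, cell_size=1.0, counts=counts)
print(assign(points + new_points, segment(counts, min_count=2), cell_size=1.0))
# -> [0, 0, 0, 1, 1, -1, 0]
```

In this sketch the dataset itself is read only once. Segmentation works on the grid, whose size depends on the number of occupied cells and not on the number of records, and an update touches only the cells that the new records fall into.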
