Journal: Procedia Computer Science

Hierarchical Density-Based Clustering Based on GPU Accelerated Data Indexing Strategy


Abstract

Due to the recent increase in the volume of generated data, organizing this data has become one of the biggest problems in Computer Science. Among the different strategies proposed to deal with this problem efficiently and effectively, we highlight those related to clustering, and more specifically density-based clustering strategies such as DBSCAN and OPTICS, which stand out for their ability to define clusters of arbitrary shape and for their robustness to noise in the data. However, these algorithms remain computationally challenging because they are distance-based. In this work we present a new approach that makes OPTICS feasible, based on a data indexing strategy. Despite the simplicity with which the data are indexed, using graphs, the strategy exposes various parallelization opportunities, which we exploit using a graphics processing unit (GPU). Based on this structure, the complexity of OPTICS is reduced to O(E log V) in the worst case, making it very fast. In our evaluation we show that our proposal can be over 200 times faster than its sequential CPU version.
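To illustrate where a bound of the form O(E log V) can come from, the following is a minimal sequential sketch of OPTICS running over a precomputed neighborhood graph. It is not the authors' GPU implementation. It assumes that the graph index stores, for each point, its neighbors within the search radius together with their distances, and that `min_pts` counts the point itself; the names `optics_on_graph`, `adj`, and `min_pts` are hypothetical. Each edge is relaxed at most once per direction and each relaxation costs one heap operation, which gives the logarithmic factor.

```python
import heapq
import math


def optics_on_graph(adj, min_pts):
    """Sequential OPTICS over a neighborhood graph (illustrative sketch).

    adj maps each vertex to a list of (neighbor, distance) pairs; this is
    assumed to be the radius-limited neighborhood graph built by the index.
    Returns the cluster ordering and the reachability distance of each vertex.
    """
    k = min_pts - 1  # neighbors required besides the point itself (assumption)

    def core_distance(v):
        # Distance to the k-th nearest stored neighbor, or inf if v is not core.
        if k <= 0:
            return 0.0
        dists = sorted(d for _, d in adj[v])
        return dists[k - 1] if len(dists) >= k else math.inf

    core = {v: core_distance(v) for v in adj}
    reach = {v: math.inf for v in adj}
    processed = set()
    order = []

    for start in adj:
        if start in processed:
            continue
        heap = [(math.inf, start)]
        while heap:
            _, v = heapq.heappop(heap)
            if v in processed:
                continue  # stale heap entry left by an earlier, larger key
            processed.add(v)
            order.append(v)
            if core[v] == math.inf:
                continue  # non-core points are emitted but not expanded
            for u, d in adj[v]:
                if u in processed:
                    continue
                new_reach = max(core[v], d)
                if new_reach < reach[u]:
                    reach[u] = new_reach
                    heapq.heappush(heap, (new_reach, u))
    return order, reach


# Two groups on a line: points 0, 1, 2 and points 3, 4, with unit spacing.
adj = {
    0: [(1, 1.0)],
    1: [(0, 1.0), (2, 1.0)],
    2: [(1, 1.0)],
    3: [(4, 1.0)],
    4: [(3, 1.0)],
}
order, reach = optics_on_graph(adj, min_pts=2)
```

In this toy graph the two groups appear as consecutive runs in `order`, and the first point of each run keeps an infinite reachability distance, which is how OPTICS marks the start of a new dense region. The sketch uses lazy deletion in the heap rather than a decrease-key operation, a common simplification that keeps the number of heap entries bounded by the number of directed edges.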
