Hashing-Based Approximate DBSCAN


Abstract

Analyzing massive amounts of data and extracting value from it has become key across different disciplines. As the amounts of data grow rapidly, however, current approaches to data analysis struggle to keep up. This is particularly true for clustering algorithms, where distance calculations between pairs of points dominate the overall running time. The data analysis and clustering process is also rarely straightforward: parameters need to be determined through several iterations. Entirely accurate results are thus rarely needed, and we can instead sacrifice precision of the final result to accelerate the computation. In this paper we develop ADvaNCE, a new approach to approximating DBSCAN. ADvaNCE uses two measures to reduce distance calculation overhead: (1) locality-sensitive hashing to approximate and speed up distance calculations and (2) representative point selection to reduce the number of distance calculations. Our experiments show that our approach is generally about one order of magnitude faster than the state of the art (up to 30× in our experiments).
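
To make measure (1) concrete, the following is a minimal sketch of how locality-sensitive hashing can limit the distance calculations behind a DBSCAN-style ε-neighborhood query. It is not the ADvaNCE implementation: the abstract does not specify the hash family or any parameters, so the p-stable (random projection) hash, the function names `make_hash`, `build_tables`, and `approx_neighbors`, and the values of `num_tables`, `hashes_per_table`, and `width` are all assumptions chosen for illustration. Representative point selection, measure (2), is not shown.

```python
import math
import random
from collections import defaultdict


def make_hash(dim, width, rng):
    """One p-stable LSH function: h(x) = floor((a . x + b) / width)."""
    a = [rng.gauss(0.0, 1.0) for _ in range(dim)]  # random Gaussian direction
    b = rng.uniform(0.0, width)                    # random offset
    def h(x):
        projection = sum(ai * xi for ai, xi in zip(a, x))
        return math.floor((projection + b) / width)
    return h


def build_tables(points, num_tables=8, hashes_per_table=2, width=4.0, seed=0):
    """Hash every point into several tables (all parameters are assumptions)."""
    rng = random.Random(seed)
    dim = len(points[0])
    tables = []
    for _ in range(num_tables):
        funcs = [make_hash(dim, width, rng) for _ in range(hashes_per_table)]
        buckets = defaultdict(list)
        for idx, p in enumerate(points):
            buckets[tuple(f(p) for f in funcs)].append(idx)
        tables.append((funcs, buckets))
    return tables


def approx_neighbors(points, tables, query_idx, eps):
    """Approximate eps-neighborhood: only points sharing a bucket are checked.

    True neighbors that never collide with the query are missed, which is
    the accuracy traded for fewer distance calculations.
    """
    q = points[query_idx]
    candidates = set()
    for funcs, buckets in tables:
        candidates.update(buckets[tuple(f(q) for f in funcs)])
    # Exact distances are computed for the candidates only.
    return sorted(i for i in candidates if math.dist(points[i], q) <= eps)
```

Because every candidate is verified with an exact distance, the result never contains a point farther than `eps` from the query; the approximation lies only in neighbors that may be missed. Using more tables lowers the miss rate at the cost of more candidates to check.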

Bibliographic details

Source: East European Conference on Advances in Databases and Information Systems | 2016 | 354p | 15 pages
Authors: Tianrun Li; Thomas Heinis; Wayne Luk
Original format: PDF
Chinese Library Classification: TP311.13-53

