Journal of Zhejiang University. Science

A statistical information-based clustering approach in distance space

Abstract

Clustering, a powerful data mining technique for discovering interesting data distributions and patterns in an underlying database, is used in many fields, such as statistical data analysis, pattern recognition, image processing, and other business applications. Density-Based Spatial Clustering of Applications with Noise (DBSCAN) (Ester et al., 1996) is a clustering method that performs well on spatial data, although it leaves many problems to be solved. For example, DBSCAN requires a user-specified threshold, and determining it with current methods such as OPTICS (Ankerst et al., 1999) is extremely time-consuming; moreover, the performance of DBSCAN under different norms has yet to be examined. In this paper, we first develop a method that uses statistical information about the distance space of the database to determine the required threshold. We then examine the performance of DBSCAN under different norms and show that there is a determinable relation between them. Finally, we use two artificial databases to verify the effectiveness and efficiency of the proposed methods.
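The abstract does not spell out how the threshold is derived from the distance space, so the following is only an illustrative sketch of the general idea, not the paper's method. It assumes a common heuristic: take each point's distance to its k-th nearest neighbour and use a quantile of those distances as the DBSCAN threshold. The function names `k_distance` and `estimate_eps`, the defaults `k=4` and `q=0.9`, and the synthetic two-cluster data are all hypothetical choices made for the example.

```python
import numpy as np

def k_distance(points, k, p):
    """Distance from each point to its k-th nearest neighbour under the L_p norm."""
    diff = np.abs(points[:, None, :] - points[None, :, :])
    if np.isinf(p):
        dist = diff.max(axis=-1)                      # Chebyshev (L-infinity) norm
    else:
        dist = (diff ** p).sum(axis=-1) ** (1.0 / p)  # Minkowski L_p norm
    # After sorting each row, column 0 is the point itself (distance 0),
    # so column k holds the k-th nearest neighbour.
    return np.sort(dist, axis=1)[:, k]

def estimate_eps(points, k=4, p=2.0, q=0.9):
    """Assumed heuristic: the q-quantile of the k-distances (not the paper's rule)."""
    return float(np.quantile(k_distance(points, k, p), q))

# Synthetic data: two Gaussian blobs in the plane (illustrative only).
rng = np.random.default_rng(0)
points = np.vstack([
    rng.normal(0.0, 0.3, size=(50, 2)),
    rng.normal(3.0, 0.3, size=(50, 2)),
])

eps_by_norm = {
    name: estimate_eps(points, p=p)
    for name, p in [("L1", 1.0), ("L2", 2.0), ("Linf", np.inf)]
}
print(eps_by_norm)
```

Because the L1 distance between two points is never smaller than their L2 distance, which in turn is never smaller than their L-infinity distance, the thresholds estimated this way are ordered in the same manner. This is a standard property of these norms and is not necessarily the "determinable relation" the paper reports.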
