
Density-based clustering of uncertain data


Abstract

In many different application areas, e.g., sensor databases, location-based services, or face recognition systems, distances between objects have to be computed based on vague and uncertain data. Commonly, the distance between two such uncertain object descriptions is expressed by a single numerical distance value. Based on such single-valued distance functions, standard data mining algorithms can work without any changes. In this paper, we propose to express the similarity between two fuzzy objects by distance probability functions. These fuzzy distance functions assign a probability value to each possible distance value. By integrating these fuzzy distance functions directly into data mining algorithms, the full information provided by these functions is exploited. In order to demonstrate the benefits of this general approach, we enhance the density-based clustering algorithm DBSCAN so that it can work directly on these fuzzy distance functions. In a detailed experimental evaluation based on artificial and real-world data sets, we show the characteristics and benefits of our new approach.
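
The abstract does not spell out how the distance probability functions are built or how DBSCAN consumes them, so the following is only a minimal sketch of the general idea, not the paper's algorithm. It assumes each uncertain object is given as a finite set of equally likely sample points, and it treats an object as a core object when the expected number of objects within `eps` reaches `min_pts`. The function names (`distance_distribution`, `prob_within`, `expected_neighbors`, `is_probable_core`) and the expected-count criterion are illustrative assumptions.

```python
import math
from itertools import product


def distance_distribution(samples_a, samples_b):
    """Discrete distance probability function between two uncertain objects.

    Assumption: every sample point of an object is equally likely, so each
    pair of samples contributes the same probability mass.
    Returns a sorted list of (distance, probability) pairs.
    """
    weight = 1.0 / (len(samples_a) * len(samples_b))
    mass = {}
    for p, q in product(samples_a, samples_b):
        d = math.dist(p, q)  # Euclidean distance (Python 3.8+)
        mass[d] = mass.get(d, 0.0) + weight
    return sorted(mass.items())


def prob_within(distribution, eps):
    """Probability that the distance is at most eps."""
    return sum(prob for dist, prob in distribution if dist <= eps)


def expected_neighbors(objects, index, eps):
    """Expected number of objects within eps of objects[index].

    The object itself counts as a certain neighbor, as in classic DBSCAN.
    """
    total = 1.0
    for j, other in enumerate(objects):
        if j != index:
            dist_fn = distance_distribution(objects[index], other)
            total += prob_within(dist_fn, eps)
    return total


def is_probable_core(objects, index, eps, min_pts):
    """Hypothetical core-object test based on the expected neighbor count."""
    return expected_neighbors(objects, index, eps) >= min_pts


# Three uncertain objects, each described by a few sample positions.
a = [(0.0, 0.0), (0.0, 2.0)]
b = [(0.0, 1.0), (0.0, 5.0)]
c = [(10.0, 0.0)]
objects = [a, b, c]

print(distance_distribution(a, b))
print(prob_within(distance_distribution(a, b), 2.0))
print(is_probable_core(objects, 0, eps=5.0, min_pts=2))
```

A single-valued distance (for example, the distance between the sample means) would discard the spread of the samples, whereas the distribution keeps it, which is the information the abstract says should be exploited. Other core-object criteria, such as thresholding the probability of having at least `min_pts` neighbors, would fit the same interface.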
