Mechanisms to improve clustering uncertain data with UKmeans

Chuan-Ming Liu; Zhendong Niu; Kuan-Teng Liao


Abstract

Clustering uncertain data with Kmeans, known as UKmeans, has been studied over the past decade. UKmeans clustering, however, suffers in both time performance and effectiveness because of the uncertainty of the objects. In this study, we propose several modified UKmeans clustering mechanisms that improve time performance and effectiveness and make the clustering more complete. The main issues are (1) reducing the time cost of clustering, (2) increasing the effectiveness of clustering, and (3) determining the number of clusters. For time performance, we use simplified object expressions to reduce the time spent comparing similarities. For effectiveness, we propose compound factors, namely the distance, the overlap between clusters and objects, and the cluster density, as the clustering standard for determining similarity. To further increase effectiveness, we also propose the concept of a cluster boundary, which affects the belongingness of an object through the overlap factor. Finally, we use an approach for evaluating the number of uncertain clusters to determine an appropriate number of clusters. In the experiments, we compare against clustering results generated by strategies commonly used for uncertain data clustering in UKmeans. Our proposed model shows better time performance, higher clustering effectiveness, and a more appropriate number of clusters than the other models.
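The abstract does not define its "simplified object expressions", so the following is only a minimal sketch of the general idea behind cheaper similarity comparisons in UKmeans, not the authors' mechanism. It assumes each uncertain object is given as a list of equally weighted sample points and that similarity is the expected squared Euclidean distance. Under those assumptions the expected distance from an object to a centroid equals the squared distance from the object's mean to the centroid plus a per-object variance term that does not depend on the centroid, so assignment can be done on the means alone. All function names (mean_point, expected_sq_dist, total_variance, ukmeans) are hypothetical.

```python
import random


def mean_point(points):
    """Component-wise mean of a non-empty list of equal-length points."""
    n = len(points)
    dims = len(points[0])
    return tuple(sum(p[d] for p in points) / n for d in range(dims))


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def expected_sq_dist(samples, centroid):
    """Brute-force expected squared distance: averages over every sample."""
    return sum(sq_dist(s, centroid) for s in samples) / len(samples)


def total_variance(samples):
    """Mean squared distance of the samples from their own mean."""
    m = mean_point(samples)
    return sum(sq_dist(s, m) for s in samples) / len(samples)


def ukmeans(objects, k, iters=20, seed=0):
    """Assumed setup: cluster uncertain objects by expected squared distance.

    Each object is reduced once to its mean, since the variance term is the
    same for every centroid and cannot change which centroid is nearest.
    """
    rng = random.Random(seed)
    means = [mean_point(obj) for obj in objects]
    centroids = rng.sample(means, k)
    labels = [0] * len(means)
    for _ in range(iters):
        # Assignment step: nearest centroid, using only the object means.
        labels = [
            min(range(k), key=lambda j: sq_dist(m, centroids[j]))
            for m in means
        ]
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [m for m, lab in zip(means, labels) if lab == j]
            if members:
                centroids[j] = mean_point(members)
    return labels, centroids


if __name__ == "__main__":
    centers = [(0, 0), (1, 0), (0, 1), (1, 1),
               (10, 10), (11, 10), (10, 11), (11, 11)]
    # Four samples per object, spread symmetrically around its center.
    objects = [
        [(x - 0.5, y), (x + 0.5, y), (x, y - 0.5), (x, y + 0.5)]
        for x, y in centers
    ]
    labels, centroids = ukmeans(objects, k=2)
    print(labels)
    print(centroids)
```

The per-object reduction costs one pass over the samples, after which every iteration compares means only. The compound factors named in the abstract (overlap, cluster density, cluster boundary) are not modeled here.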


Bibliographic record

Source: Data & Knowledge Engineering, 2018, Issue 7, pp. 61-79 (19 pages)
Authors: Chuan-Ming Liu; Zhendong Niu; Kuan-Teng Liao
Affiliations:

Department of Computer Science and Information Engineering, National Taipei University of Technology;

School of Computer Science and Technology, Beijing Institute of Technology;

Department of Computer Science and Information Engineering, National Taipei University of Technology

Indexed in: Science Citation Index (SCI); Engineering Index (EI)
Original format: PDF
Language: English
Keywords: Uncertain data; Clustering; Centroid boundary

Similar literature

1. Clustering Uncertain Gene Dataset using KLSE (Kullback-Leibler & Shennon Entropy) to improve Cluster Quality [J]. S. Vydehi, M. Punithavalli. International Journal of Applied Engineering Research. 2016, Issue 4aPta6
2. An improved algorithm for clustering uncertain traffic data streams based on Hadoop platform [J]. Xu Weixiang, Li Jiaojiao. International Journal of Modern Physics B: Condensed Matter Physics, Statistical Physics, Applied Physics. 2019, Issue 19
3. Improved bisector clustering of uncertain data using SDSA method on parallel processors [J]. Lukić Ivica, Slavek Ninoslav, Köhler Mirko. Technical Gazette. 2013, Issue 2
4. An Effective Clustering Mechanism for Uncertain Data Mining Using Centroid Boundary in UKmeans [C]. Kuan-Teng Liao, Chuan-Ming Liu. 2016 International Computer Symposium. 2016
5. Improving database performances in a changing environment with uncertain and dynamic information demand: An intelligent database system approach [D]. Chen, Andrew Nai-Kuang. 1999
6. DAFi: A Directed Recursive Data Filtering and Clustering Approach for Improving and Interpreting Data Clustering Identification of Cell Populations from Polychromatic Flow Cytometry Data [O]. Alexandra J. Lee, Ivan Chang, Julie G. Burel
7. Uncertain centroid based partitional clustering of uncertain data [O]. Francesco Gullo, Andrea Tagarelli. 2012
8. The Clustering and Security Mechanisms of a Database Computer (DBC) [R]. Banerjee, J., Hsiao, D. K., Menon, J. 1979
