Robust techniques and applications in fuzzy clustering.

Abstract

This dissertation addresses issues central to fuzzy classification. It first addresses the sensitivity to noise and outliers of clustering techniques based on least-squares minimization, such as Fuzzy c-Means (FCM) and its variants. Two novel, robust clustering schemes are presented and analyzed in detail; they approach the problem of robustness from different perspectives. The first scheme scales down the FCM memberships of data points based on the distance of the points from the cluster centers. Scaling applied to outliers reduces their membership in true clusters. This scheme, known as Mega-clustering, defines a conceptual mega-cluster that is a collective cluster of all data points but treats outliers and good points differently (as opposed to the concept of Dave's Noise cluster). The scheme is presented and validated with experiments, and its similarities with Noise Clustering (NC) are also presented. The other scheme is based on the feasible solution algorithm that implements the Least Trimmed Squares (LTS) estimator. The LTS estimator is known to be resistant to noise and has a high breakdown point. The feasible solution approach also guarantees convergence of the solution set to a global optimum. Experiments show the practicability of the proposed schemes in terms of computational requirements and the appeal of their simple frameworks.

The validation of clustering results has often received less attention than clustering itself. Fuzzy and non-fuzzy cluster validation schemes are reviewed, and a novel methodology for cluster validity using a test for the random position hypothesis is developed. The random position hypothesis is tested against an alternative clustered hypothesis on every cluster produced by the partitioning algorithm. The Hopkins statistic is used as the basis for accepting or rejecting the random position hypothesis, which is also the null hypothesis in this case. The Hopkins statistic is known to be a fair estimator of randomness in a data set. The concept is borrowed from the clustering tendency domain, and its applicability to validating clusters is shown here.

A unique feature selection procedure for use with large, high-dimensional molecular conformational datasets is also developed. The intelligent feature extraction scheme not only reduces the dimensionality of the feature space but also eliminates contentious issues such as those associated with the labeling of symmetric atoms in the molecule. The feature vector is converted to a proximity matrix and used as input to the relational fuzzy clustering (FRC) algorithm, with very promising results. Results are also validated using several cluster validity measures from the literature. Another application of fuzzy clustering considered here is image segmentation. Image analysis on extremely noisy images is carried out as a precursor to the development of an automated, real-time condition state monitoring system for underground pipelines. A two-stage FCM with intelligent feature selection is implemented as the segmentation procedure, and results on a test image are presented. A conceptual framework for automated condition state assessment is also developed.
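
As an illustration of the Hopkins statistic that the abstract uses to test the random position hypothesis, the following is a minimal sketch of one common formulation, not the dissertation's own procedure. The function name `hopkins_statistic`, the sample size `m`, the bounding-box sampling window, and the use of distances raised to the power of the dimension are all assumptions made for this example. Under this convention, values near 0.5 are consistent with spatial randomness and values near 1 suggest clustering.

```python
import numpy as np

def hopkins_statistic(X, m=None, seed=0):
    """Hopkins statistic (one common convention, assumed here):
    about 0.5 for spatially random data, close to 1 for clustered data."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    if m is None:
        m = max(1, n // 10)  # assumed default: sample 10% of the points

    # w: distance from m sampled data points to their nearest *other* data point
    idx = rng.choice(n, size=m, replace=False)
    dist_real = np.linalg.norm(X[idx, None, :] - X[None, :, :], axis=2)
    dist_real[np.arange(m), idx] = np.inf  # ignore each point's distance to itself
    w = dist_real.min(axis=1)

    # u: distance from m uniform points in the bounding box to the nearest data point
    lo, hi = X.min(axis=0), X.max(axis=0)
    U = rng.uniform(lo, hi, size=(m, d))
    u = np.linalg.norm(U[:, None, :] - X[None, :, :], axis=2).min(axis=1)

    return (u ** d).sum() / ((u ** d).sum() + (w ** d).sum())

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Two tight synthetic blobs versus uniformly scattered points
    blobs = np.vstack([rng.normal(0, 0.1, (100, 2)), rng.normal(10, 0.1, (100, 2))])
    print("clustered:", hopkins_statistic(blobs, m=20))
    print("uniform:  ", hopkins_statistic(rng.uniform(0, 1, (500, 2)), m=50))
```

To apply the idea per cluster, as the abstract describes, one would call such a function on the points assigned to each cluster in turn; the acceptance threshold is left open here because the source does not state one.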

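Bibliographic record

  • Author

    Banerjee, Amit.

  • Author affiliation

    New Jersey Institute of Technology.

  • Degree-granting institution: New Jersey Institute of Technology.
  • Subject: Engineering, Mechanical; Computer Science.
  • Degree: Ph.D.
  • Year: 2005
  • Pages: 132 p.
  • Total pages: 132
  • Original format: PDF
  • Language of text: eng
  • Chinese Library Classification: Machinery and instrument industry; Automation technology, computer technology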