Journal: Journal of Intelligent & Fuzzy Systems: Applications in Engineering and Technology

Feature reduction fuzzy C-Means algorithm leveraging the marginal kurtosis measure


Abstract

The feature reduction fuzzy c-means (FRFCM) algorithm has been shown to be effective for clustering data with redundant or unimportant features. However, it still has the following disadvantages. 1) The FRFCM uses the mean-to-variance ratio (MVR) index to measure feature importance, but this index is affected by data normalization: a large MVR value for an original feature may become small once the data are normalized, and vice versa. Moreover, the MVR values of the important features of a dataset are not necessarily large. 2) The feature weights obtained by the FRFCM are sensitive to the initial cluster centers and initial feature weights. 3) The FRFCM may be unable to assign proper weights to the features of a dataset, so during feature reduction learning important features may be discarded while unimportant features are retained. These disadvantages can cause the FRFCM to discard important feature components. In addition, the threshold for selecting the important features in the FRFCM may not be easy to determine. To mitigate these disadvantages, we first devise a new index, the marginal kurtosis measure (MKM), to measure the importance of each feature in a dataset. We then propose a novel and robust feature reduction fuzzy c-means clustering algorithm, the FRFCM-MKM, which incorporates the MKM into the FRFCM. Furthermore, an accurate threshold is introduced to select important features and discard unimportant ones. Experiments on synthetic and real-world datasets demonstrate that the FRFCM-MKM is effective and efficient.
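
The normalization issue raised in point 1 can be illustrated with a small numerical sketch. The abstract does not give formulas, so two assumptions are made here: the MVR is read as the per-feature mean divided by the per-feature variance, and the ordinary per-feature sample kurtosis stands in for the MKM, whose actual definition in the paper may differ. The synthetic data and all function names are hypothetical.

```python
import numpy as np

def mean_to_variance_ratio(x):
    """Per-feature mean divided by variance (assumed reading of the MVR)."""
    return x.mean(axis=0) / x.var(axis=0)

def marginal_kurtosis(x):
    """Per-feature sample kurtosis: fourth central moment over squared variance.

    Used only as a stand-in for the paper's MKM, which is not defined in the abstract.
    """
    centered = x - x.mean(axis=0)
    return (centered ** 4).mean(axis=0) / (centered ** 2).mean(axis=0) ** 2

def min_max_normalize(x):
    """Rescale every feature to the interval [0, 1]."""
    lo, hi = x.min(axis=0), x.max(axis=0)
    return (x - lo) / (hi - lo)

rng = np.random.default_rng(0)
# Feature 0: two separated groups; feature 1: wide uniform noise (synthetic data).
grouped = np.concatenate([rng.normal(100.0, 1.0, 500), rng.normal(110.0, 1.0, 500)])
noise = rng.uniform(0.0, 1000.0, 1000)
data = np.column_stack([grouped, noise])
normalized = min_max_normalize(data)

mvr_raw = mean_to_variance_ratio(data)
mvr_norm = mean_to_variance_ratio(normalized)
kurt_raw = marginal_kurtosis(data)
kurt_norm = marginal_kurtosis(normalized)

print("MVR raw:       ", mvr_raw)
print("MVR normalized:", mvr_norm)
print("Kurtosis raw:       ", kurt_raw)
print("Kurtosis normalized:", kurt_norm)
```

Under these assumptions the MVR values change after min-max scaling, whereas the kurtosis values agree up to floating-point error, because kurtosis is unchanged by shifting and positively rescaling a feature. This shows only the invariance property; it does not reproduce the paper's weighting scheme or threshold.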
