
Selecting Diversifying Heuristics for Cluster Ensembles


Abstract

Cluster ensembles are deemed to be better than single clustering algorithms for discovering complex or noisy structures in data. Various heuristics for constructing such ensembles have been examined in the literature, e.g., random feature selection, weak clusterers, and random projections. Typically, one heuristic is picked at a time to construct the ensemble. To increase the diversity of the ensemble, several heuristics may be applied together. However, not every combination is necessarily beneficial. Here we apply a standard genetic algorithm (GA) to select from 7 standard heuristics for k-means cluster ensembles. The ensemble size is also encoded in the chromosome. In this way the data is forced to guide the selection of heuristics as well as the ensemble size. Eighteen moderate-size datasets were used: 4 artificial and 14 real. The results resonate with our previous findings in that high diversity is not necessarily a prerequisite for high accuracy of the ensemble. No particular combination of heuristics was chosen consistently across all datasets, which justifies the existing variety of cluster ensembles. Among the most often selected heuristics were random feature extraction, random feature selection, and a random number of clusters assigned to each ensemble member. Based on the experiments, we recommend that the current practice of using one or two heuristics for building k-means cluster ensembles be revised in favour of using 3-5 heuristics.
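
The chromosome described in the abstract, one part selecting among 7 heuristics and one part encoding the ensemble size, might be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract names only some of the heuristics, so four entries are placeholders; the 2-bit size encoding, the candidate sizes in `ENSEMBLE_SIZES`, and the elitist selection scheme in `run_ga` are all hypothetical choices made for this example.

```python
import random

# Three names come from the abstract; the remaining four are placeholders,
# because the abstract does not list all seven heuristics.
HEURISTICS = [
    "random_feature_extraction",
    "random_feature_selection",
    "random_number_of_clusters",
    "heuristic_4", "heuristic_5", "heuristic_6", "heuristic_7",
]
# Hypothetical candidate ensemble sizes, indexed by the trailing bits.
ENSEMBLE_SIZES = [5, 10, 25, 50]
SIZE_BITS = 2
CHROMOSOME_LENGTH = len(HEURISTICS) + SIZE_BITS


def decode(chromosome):
    """Map a list of bits to (selected heuristics, ensemble size)."""
    selected = [h for h, bit in zip(HEURISTICS, chromosome) if bit]
    size_bits = chromosome[len(HEURISTICS):]
    index = int("".join(str(b) for b in size_bits), 2)
    return selected, ENSEMBLE_SIZES[index]


def crossover(a, b, rng):
    """One-point crossover producing a single child."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]


def mutate(chromosome, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [1 - bit if rng.random() < rate else bit for bit in chromosome]


def run_ga(fitness, pop_size=20, generations=30, rate=0.05, seed=0):
    """Simple elitist GA; `fitness` maps a chromosome to a score to maximize."""
    rng = random.Random(seed)
    population = [
        [rng.randint(0, 1) for _ in range(CHROMOSOME_LENGTH)]
        for _ in range(pop_size)
    ]
    for _ in range(generations):
        # Keep the better half unchanged, refill with mutated offspring.
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng),
                   rate, rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return max(population, key=fitness)
```

In a setting like the one the abstract describes, `fitness` would presumably build a k-means ensemble using the decoded heuristics and ensemble size and then score the resulting partition on the dataset; any cheap stand-in (for example `run_ga(lambda c: sum(c))`) is enough to exercise the encoding and the operators.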
