Abstract

Distance-based neural network clustering rests on the intrinsic assumption that a particular neuron in the network represents a cluster centroid. However, not all of these neurons can perfectly represent the training data; each neuron can represent only part of the training samples. This paper proposes an effective training data splitting method (TDSM) that finds perfectly representative neurons and improves the clustering results of a distance-based neural network without changing the original network's internal algorithm or the quality of the training data. The method enlarges a network with N neurons into a new network with m × N neurons. These neurons form m subnetworks, and each subnetwork perfectly represents a part of the training set, in the sense that the clustering quality indicators (purity, normalized mutual information, and adjusted Rand index) all equal 1. The results are statistically validated with a t-test, and we demonstrate that the TDSM outperforms the original clustering paradigm on several real datasets.
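As an aside, the purity indicator mentioned above is simple to compute: each cluster is credited with its majority true class, and purity is the fraction of samples so credited. A minimal illustrative sketch (not the paper's implementation, and independent of the TDSM itself) might look like:

```python
from collections import Counter

def purity(true_labels, cluster_labels):
    """Purity: fraction of samples that belong to the majority true
    class of their assigned cluster. Equals 1 when every cluster is
    class-pure, as claimed for each TDSM subnetwork."""
    clusters = {}
    for t, c in zip(true_labels, cluster_labels):
        clusters.setdefault(c, []).append(t)
    correct = sum(Counter(members).most_common(1)[0][1]
                  for members in clusters.values())
    return correct / len(true_labels)

# A perfectly pure clustering (the situation each subnetwork attains):
print(purity([0, 0, 1, 1], [5, 5, 9, 9]))  # 1.0
# A mixed cluster lowers purity:
print(purity([0, 1, 1, 1], [5, 5, 5, 5]))  # 0.75
```

Normalized mutual information and the adjusted Rand index likewise equal 1 exactly when the clustering matches the true labeling up to a relabeling of clusters.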