Most well-known partitioning clustering algorithms iterate until they converge to a stable state. One problem is that both clustering quality and execution time are highly sensitive to initial conditions (e.g., the initial cluster centers and the number of clusters). The method used to measure similarity between two transactions is another important factor. In general, the similarity measure is fixed in advance and is usually a metric-based distance, which does not account for variation in the content. The disadvantage is that an analyst cannot modify the measure to suit the needs of a particular analysis. In this paper, therefore, we propose a novel constrained clustering algorithm called CCKD (short for Constrained Clustering depend on Known data Distribution). With CCKD, the analyst can specify constraints on the similarity measure that set the conditions under which clusters are captured. In addition, our empirical results indicate that CCKD is effective and stable and requires no iterative procedure.