This paper addresses the problem of estimating the number of component clusters in gray-level image histograms. Such histograms are commonly modeled as a mixture of univariate normal densities, but the number of components in the mixture is unknown and must be estimated along with the means and variances. Determining the number of components typically requires unsupervised learning, a problem known as "cluster validation" in the cluster-analysis literature; the aim is to identify sub-populations believed to be present in a population. A wide variety of methods have been proposed for this purpose. We compare two methods, each representative of a typical approach. The first, a somewhat classical method, is based on criterion optimization; we focus in particular on Akaike's information criterion (AIC). The second is a direct approach that exploits the geometric properties of clusters. We also develop an algorithm for generating non-overlapping test vectors, which yields a large set of verified vectors for objective evaluation and comparison of the two methods.