Chow and Liu introduced an algorithm for fitting a multivariate distribution with a tree (i.e. a density model that assumes that there are only pairwise dependencies between variables) and that the graph of these dependencies is a spanning tree. The original algorithm is quadratic in the dimension of the domain, and linear in the number of data points that define the target distribution P. This paper shows that for sparse, discrete data, fitting a tree distribution can be done in time and memory that is typically sub-quadratic in the number of variables and the size of the data set. The new algorithm, called the acCL algorithm, achieves speed ups of several orders of magnitude in the experiments.
展开▼