A new notion of empirical informational divergence (relative entropy) between two individual sequences is introduced. If the two sequences are independent realizations of two stationary Markov processes, the empirical relative entropy converges to the true divergence almost surely. This new empirical divergence is based on a version of the Lempel-Ziv data compression algorithm. Building on it, a simple universal algorithm for classifying individual sequences into a finite number of classes is introduced. For almost every given set of training sequences and almost any test sequence from these classes, it discriminates between the classes whenever they are distinguishable by some finite-memory classifier. The algorithm is universal in the sense that it is independent of the unknown sources.
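To make the idea concrete, the following Python sketch shows one common reading of a Lempel-Ziv cross-parsing divergence estimator of this kind: a test sequence z is parsed into the longest phrases that occur as substrings of a training sequence x, the phrase count is compared against an incremental (LZ78-style) self-parsing of z, and classification picks the class minimizing the resulting estimate. The function names, the exact combination of phrase counts, and the log base are illustrative assumptions, not the paper's definitive construction.

```python
import math


def lz78_phrase_count(z: str) -> int:
    """Number of phrases in an incremental (LZ78-style) self-parsing of z."""
    dictionary, phrase, count = set(), "", 0
    for symbol in z:
        phrase += symbol
        if phrase not in dictionary:  # new phrase ends here
            dictionary.add(phrase)
            count += 1
            phrase = ""
    if phrase:  # a final, possibly repeated, partial phrase
        count += 1
    return count


def cross_phrase_count(z: str, x: str) -> int:
    """Parse z sequentially into the longest phrases that occur as
    contiguous substrings of x; unmatched symbols count as length-1 phrases."""
    count, i, n = 0, 0, len(z)
    while i < n:
        j = i + 1
        while j <= n and z[i:j] in x:  # extend the match while possible
            j += 1
        count += 1
        i = max(j - 1, i + 1)  # advance past the phrase, at least one symbol
    return count


def empirical_divergence(z: str, x: str) -> float:
    """Empirical divergence estimate of z with respect to x (an assumed
    form, (1/n) * [c(z|x) log n - c(z) log c(z)], in bits per symbol)."""
    n = len(z)
    c_cross = cross_phrase_count(z, x)
    c_self = lz78_phrase_count(z)
    return (c_cross * math.log2(n) - c_self * math.log2(c_self)) / n


def classify(z: str, training: list[str]) -> int:
    """Assign z to the class whose training sequence minimizes the
    empirical divergence (the universal classification rule sketched above)."""
    return min(range(len(training)), key=lambda k: empirical_divergence(z, training[k]))
```

Under this reading, no statistics of the sources are estimated explicitly; the classifier depends on the sequences only through their parsings, which is what makes the rule universal.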