In this contribution we consider sequence learning by means of unsupervised and supervised vector quantization that is invariant with respect to shifts in the sequences. Hankel matrices, combined with an appropriate dissimilarity measure based on subspace angles, provide a mathematical tool for obtaining such an invariant representation and comparison of sequences. We discuss their mathematical properties and show how they can be incorporated into prototype-based vector quantization schemes such as neural gas and self-organizing maps for clustering and data compression in the unsupervised case. For classification learning we refer to the closely related supervised learning vector quantization scheme. In particular, median variants of these vector quantizers allow a straightforward application of Hankel matrices. A possible application of the Hankel matrix approach is the analysis of DNA sequences, since its invariance properties make an alignment of the sequences unnecessary.
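
To illustrate the shift invariance, the following minimal sketch compares real-valued sequences through the dominant column subspaces of their Hankel matrices. It is not the method of this contribution: the function names, the parameters `window` and `rank`, and the choice of the chordal distance as the subspace-angle-based dissimilarity are all assumptions made for illustration, and a symbolic sequence such as DNA would first need a numerical encoding.

```python
import numpy as np

def hankel_matrix(seq, window):
    """Stack all length-`window` sliding windows of `seq` as columns."""
    seq = np.asarray(seq, dtype=float)
    n_cols = len(seq) - window + 1
    # Entry (i, j) is seq[i + j], so anti-diagonals are constant (Hankel structure).
    idx = np.arange(window)[:, None] + np.arange(n_cols)[None, :]
    return seq[idx]

def dominant_subspace(H, rank):
    """Orthonormal basis of the `rank` leading left singular vectors of H."""
    U, _, _ = np.linalg.svd(H, full_matrices=False)
    return U[:, :rank]

def subspace_dissimilarity(seq_a, seq_b, window=8, rank=2):
    """Chordal distance between the dominant Hankel subspaces (assumed measure)."""
    Ua = dominant_subspace(hankel_matrix(seq_a, window), rank)
    Ub = dominant_subspace(hankel_matrix(seq_b, window), rank)
    # Singular values of Ua^T Ub are the cosines of the principal angles.
    cosines = np.clip(np.linalg.svd(Ua.T @ Ub, compute_uv=False), 0.0, 1.0)
    # sqrt(sum of sin^2 of the principal angles)
    return float(np.sqrt(np.sum(1.0 - cosines**2)))

t = np.arange(64)
original = np.sin(0.3 * t)
shifted = np.sin(0.3 * (t + 5))   # same signal, shifted by 5 samples
different = np.sin(0.9 * t)       # signal with a different frequency

d_shift = subspace_dissimilarity(original, shifted)
d_other = subspace_dissimilarity(original, different)
print(d_shift, d_other)
```

In this toy setting the shifted sinusoid spans the same two-dimensional subspace as the original, so their dissimilarity is zero up to rounding, whereas the sinusoid with a different frequency is clearly separated. No alignment step is involved at any point.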