A connection is established between channel modeling in mathematical information theory and a certain extension of Levenshtein distances. The model assigns positive real weights to the elementary editing operations (substitution, deletion, insertion, and ending); these weights may depend on arbitrary finite contexts. Given a context structure, an algorithm is designed for estimating context-dependent probabilities of the elementary editing operations from a finite training corpus. The algorithm is similar to the EM algorithm, with the maximization step replaced by an estimation step that determines probability structures from weighted counts. Conditions on the context structure are formulated that make this estimation problem algorithmically accessible by reducing it to the general problem of estimating a probability vector on a finite state space from counts.
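As a rough illustration of the two ingredients named above, the following sketch first estimates a probability vector from weighted counts by plain normalization and then uses it to score a string pair with a weighted Levenshtein-style recursion. It is not the algorithm of the paper: it ignores contexts entirely, it assumes that weights are negative log probabilities (a common convention that the abstract does not state), and all function names and the tuple encoding of operations are hypothetical.

```python
import math


def estimate_probabilities(weighted_counts):
    """Normalize non-negative weighted counts into a probability vector."""
    total = sum(weighted_counts.values())
    if total <= 0:
        raise ValueError("counts must have positive total mass")
    return {op: c / total for op, c in weighted_counts.items()}


def weights_from_probabilities(probs):
    """Assumed convention: weight = -log(probability); zero-probability ops are dropped."""
    return {op: -math.log(p) for op, p in probs.items() if p > 0}


def weighted_edit_distance(source, target, weight):
    """Cheapest edit sequence turning source into target (context-free sketch).

    Operations are encoded (hypothetically) as ("sub", a, b), ("del", a),
    ("ins", b) and ("end",). Operations without a weight cost infinity.
    """
    inf = math.inf
    n, m = len(source), len(target)
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            if i > 0:  # delete source[i-1]
                cost = weight.get(("del", source[i - 1]), inf)
                d[i][j] = min(d[i][j], d[i - 1][j] + cost)
            if j > 0:  # insert target[j-1]
                cost = weight.get(("ins", target[j - 1]), inf)
                d[i][j] = min(d[i][j], d[i][j - 1] + cost)
            if i > 0 and j > 0:  # substitute (or copy) source[i-1] -> target[j-1]
                cost = weight.get(("sub", source[i - 1], target[j - 1]), inf)
                d[i][j] = min(d[i][j], d[i - 1][j - 1] + cost)
    # Every edit sequence is closed by the "ending" operation.
    return d[n][m] + weight.get(("end",), inf)
```

Under the assumed convention, the returned distance is the negative log probability of the most probable edit sequence, which is one way to read the link between the channel model and the distance. In an EM-like loop, the weighted counts would come from expected operation counts over the training corpus rather than being given directly.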