We propose to apply the concept of irrelevant variability normalization to the general problem of learning structure from data. Because of the problems of a diversified training data set and/or possible acoustic mismatches between training and testing conditions, the structure learned from the training data by using a maximum likelihood training method will not necessarily generalize well on mismatched tasks. We apply the above concept to the structural learning problem of phonetic decision-tree based hidden Markov model (HMM) state tying. We present a new method that integrates a linear-transformation based normalization mechanism into the decision-tree construction process to make the learned structure have a better modeling capability and generalizability. The viability and efficacy of the proposed method are confirmed in a series of experiments for continuous speech recognition of Mandarin Chinese.
展开▼