Phonetic decision-tree-based state tying is used in most large vocabulary continuous speech recognition (LVCSR) systems. In most cases, however, the training samples are distributed very unevenly across the leaf nodes, which may degrade recognition performance. In this work, node merging techniques are proposed to alleviate this problem and to further reduce the number of senones. In addition, to lessen the impact of rare triphones on the quality of decision-tree-based state tying and to improve the accuracy of each final senone, two methods for handling rare triphones are added to hidden Markov model (HMM) acoustic modeling before state tying. Experimental results show that these methods greatly improve the robustness of the decision tree and achieve better performance with fewer parameters.