Proper nouns present a challenging problem for current speech recognition technology because they often do not follow typical letter-to-sound conversion rules. Several automated methods, including Boltzmann machines, decision trees, and recurrent neural networks, have been attempted in the literature, yet no single system has achieved an acceptable error rate. Because the goal of this project is to generate pronunciation dictionaries for speech recognition, however, we can simply combine the outputs of the multiple systems and use total database coverage as our scoring metric. For generating at least one correct pronunciation for every name, combining all systems gives a 19.6% error rate, a 23.1% absolute reduction over the best previous system. For generating every pronunciation in the database, the combined system has an error rate of 29.1%, a 23.6% reduction.
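The combination step and the two coverage-based scores can be illustrated with a minimal sketch. It is not the implementation used in this work: the function names, the phone strings, and the toy data are hypothetical, and the second score is computed here as the fraction of reference pronunciations that no system generated, which is one possible reading of "every pronunciation in the database".

```python
from typing import Dict, Iterable, Set, Tuple

# Assumption: a pronunciation is a space-separated phone string.
Lexicon = Dict[str, Set[str]]


def combine(system_outputs: Iterable[Lexicon]) -> Lexicon:
    """Union the candidate pronunciations proposed by each system, per name."""
    combined: Lexicon = {}
    for output in system_outputs:
        for name, prons in output.items():
            combined.setdefault(name, set()).update(prons)
    return combined


def error_rates(reference: Lexicon, hypothesis: Lexicon) -> Tuple[float, float]:
    """Return (name_error, pron_error) as fractions in [0, 1].

    name_error: share of names with no correct pronunciation generated.
    pron_error: share of reference pronunciations that were not generated
                (an assumed definition, see the lead-in).
    """
    total_prons = sum(len(prons) for prons in reference.values())
    if not reference or total_prons == 0:
        raise ValueError("reference lexicon must contain pronunciations")

    names_missed = 0
    prons_missed = 0
    for name, ref_prons in reference.items():
        hyp_prons = hypothesis.get(name, set())
        if not (ref_prons & hyp_prons):
            names_missed += 1
        prons_missed += len(ref_prons - hyp_prons)

    return names_missed / len(reference), prons_missed / total_prons


# Hypothetical toy data: two names, three reference pronunciations.
reference = {
    "Nguyen": {"w ih n", "n uw y eh n"},
    "Koch": {"k ow k"},
}
system_a = {"Nguyen": {"n uw y eh n"}, "Koch": {"k aa ch"}}
system_b = {"Nguyen": {"n g uw y eh n"}, "Koch": {"k ow k"}}

single_name_err, single_pron_err = error_rates(reference, system_a)
combined_name_err, combined_pron_err = error_rates(
    reference, combine([system_a, system_b])
)
```

Because the combined lexicon is a union, adding a system can only keep or lower both error fractions under these definitions. The cost is a larger dictionary with more incorrect candidates, which this sketch does not measure.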