In this paper we present a method for building compact lattices for very large vocabularies, which has been applied to surname recognition in an interactive telephone-based Directory Assistance Services system. The method involves the construction of a non-deterministic directed acyclic word graph (DAWG), which is then transformed into a phoneme lattice in the format of Entropic's HTK Application Programming Interface (HAPI). Incremental construction functions are used to create and update the DAWG, and an algorithm for converting the DAWG into the HAPI format is presented. Furthermore, trees, graphs and full-forms (whole words with no merging of nodes) are compared directly under the same conditions, using the same decoder (HAPI MVX) and the same vocabularies. Experimental results showed that moving from full-form lexicons to trees and then to graphs reduces the size of the recognition network and, with it, the recognition time. Recognition accuracy, however, is retained, since the same phoneme combinations are involved.
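The size reduction from full-forms to trees to graphs can be illustrated with a minimal sketch. It is not the incremental, non-deterministic construction described in the paper: it builds a prefix tree and then merges identical subtrees, which is the standard deterministic minimisation, and it counts phoneme-labelled arcs as a rough proxy for network size. The toy vocabulary, the `END` marker and all function names are assumptions made for illustration only.

```python
END = "<end>"  # hypothetical end-of-word marker, not part of any HAPI format


def full_form_size(words):
    """Full-form lexicon: every word keeps its own phoneme chain, nothing is shared."""
    return sum(len(word) for word in words)


def build_trie(words):
    """Prefix tree as nested dicts: words sharing a prefix share its arcs."""
    root = {}
    for word in words:
        node = root
        for phoneme in word:
            node = node.setdefault(phoneme, {})
        node[END] = {}  # mark that a word ends at this node
    return root


def count_trie_arcs(node):
    """Number of phoneme-labelled arcs in the prefix tree."""
    return sum(
        1 + count_trie_arcs(child)
        for label, child in node.items()
        if label != END
    )


def count_dawg_arcs(root):
    """Merge identical subtrees (shared suffixes) and count the remaining arcs."""
    registry = {}  # subtree signature -> state id
    arcs = 0

    def visit(node):
        nonlocal arcs
        # A state is identified by its outgoing labels and the states they reach.
        signature = tuple(
            sorted((label, visit(child)) for label, child in node.items())
        )
        if signature not in registry:
            registry[signature] = len(registry)
            arcs += sum(1 for label, _ in signature if label != END)
        return registry[signature]

    visit(root)
    return arcs


# Toy vocabulary of phoneme sequences (invented, not taken from the paper).
words = [("t", "a", "p"), ("t", "o", "p"), ("m", "a", "p"), ("m", "o", "p")]
trie = build_trie(words)
print(full_form_size(words), count_trie_arcs(trie), count_dawg_arcs(trie))
```

For this toy vocabulary the three counts are 12, 10 and 5. The set of phoneme sequences accepted is identical in all three representations, which is consistent with the observation that accuracy is retained while the network shrinks.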