
Fast Very Large Vocabulary Recognition Based on Compact DAWG-Structured Language Models


Abstract

In this paper we present a method for building compact lattices for very large vocabularies, which has been applied to surname recognition in an Interactive telephone-based Directory Assistance Services system. The method involves the construction of a non-deterministic DAWG (directed acyclic word graph), which is eventually transformed into a phoneme lattice in Entropic's HTK Application Programming Interface (HAPI) format. Incremental construction functions are used for the creation and update of the DAWG, and an algorithm for converting the DAWG into the HAPI format is presented. Furthermore, trees, graphs and full-forms (whole words with no merging of nodes) are compared directly under the same conditions, using the same decoder (HAPI MVX) and the same vocabularies. Experimental results showed that moving from full-form lexicons to trees and then to graphs reduces the size of the recognition network and, with it, the recognition time. Recognition accuracy, however, is retained, since the same phoneme combinations are involved.
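The size comparison described in the abstract (full-forms, then trees, then graphs) can be illustrated with a small sketch. This is not the authors' algorithm: the paper builds a non-deterministic DAWG incrementally, whereas the code below builds a prefix tree and then merges identical subtrees in one bottom-up pass, which yields a deterministic DAWG. The phoneme strings in `PRONS`, the function names, and the choice to measure network size in arcs are all assumptions made for illustration.

```python
from typing import Dict, Sequence, Tuple

Pron = Tuple[str, ...]  # one pronunciation = a sequence of phoneme labels

# Hypothetical toy lexicon (not taken from the paper).
PRONS: Sequence[Pron] = (
    ("s", "m", "i", "th"),
    ("s", "m", "i", "t"),
    ("sh", "m", "i", "t"),
    ("sh", "m", "i", "th"),
)


class TrieNode:
    __slots__ = ("children", "final")

    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.final = False


def full_form_arcs(prons: Sequence[Pron]) -> int:
    """Full-form lexicon: every word is its own chain, nothing is shared."""
    return sum(len(p) for p in prons)


def build_trie(prons: Sequence[Pron]) -> TrieNode:
    """Tree lexicon: words share common prefixes."""
    root = TrieNode()
    for pron in prons:
        node = root
        for phoneme in pron:
            node = node.children.setdefault(phoneme, TrieNode())
        node.final = True
    return root


def trie_arcs(root: TrieNode) -> int:
    """Count the arcs of the prefix tree."""
    arcs, stack = 0, [root]
    while stack:
        node = stack.pop()
        arcs += len(node.children)
        stack.extend(node.children.values())
    return arcs


def dawg_size(root: TrieNode) -> Tuple[int, int]:
    """Graph lexicon: merge identical subtrees; return (states, arcs)."""
    registry: Dict[tuple, int] = {}  # subtree signature -> merged state id
    arcs = 0

    def visit(node: TrieNode) -> int:
        nonlocal arcs
        signature = (
            node.final,
            tuple(sorted((ph, visit(child)) for ph, child in node.children.items())),
        )
        if signature not in registry:
            registry[signature] = len(registry)
            arcs += len(node.children)  # count arcs once per merged state
        return registry[signature]

    visit(root)
    return len(registry), arcs


if __name__ == "__main__":
    trie = build_trie(PRONS)
    print("full-form arcs:", full_form_arcs(PRONS))
    print("tree arcs:", trie_arcs(trie))
    print("graph (states, arcs):", dawg_size(trie))
```

For the four toy entries this gives 16 arcs for the full-form lexicon, 10 for the tree, and 6 for the graph, while the set of phoneme sequences accepted stays the same. That is the sense in which the network shrinks without changing which phoneme combinations are recognized.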
