
A LARGE-VOCABULARY CONTINUOUS SPEECH RECOGNITION ALGORITHM AND ITS APPLICATION TO A MULTI-MODAL TELEPHONE DIRECTORY ASSISTANCE SYSTEM



Abstract

This paper describes an accurate and efficient algorithm for very-large-vocabulary continuous speech recognition based on an HMM-LR algorithm. The HMM-LR algorithm uses a generalized LR parser as a language model and hidden Markov models (HMMs) as phoneme models. To reduce the search space without pruning the correct candidate, we use forward and backward trellis likelihoods, an adjusting window for choosing only the probable part of the trellis for each predicted phoneme, and an algorithm for merging candidates that have the same allophonic phoneme sequences and the same context-free grammar states. Candidates are also merged at the meaning level. This algorithm is applied to a telephone directory assistance system that recognizes spontaneous speech containing the names and addresses of more than 70,000 subscribers (vocabulary size is about 80,000). The experimental results show that the system performs well in spite of the large perplexity. This algorithm was also applied to a multi-modal telephone directory assistance system, and the system was evaluated from the human-interface point of view. To cope with the problem of background noise, an HMM composition technique which combines a noise-source HMM and a clean phoneme HMM into a noise-added phoneme HMM was investigated and incorporated into the system.
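
The candidate-merging step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how merged candidates are scored, so this sketch assumes the best-scoring candidate is kept (Viterbi-style). The names `Candidate`, `grammar_state`, and `merge_candidates` are hypothetical, and a single integer stands in for the parser state, which in a generalized LR parser would be a richer structure.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Candidate:
    phonemes: Tuple[str, ...]   # allophonic phoneme sequence hypothesised so far
    grammar_state: int          # hypothetical id standing in for the LR parser state
    log_likelihood: float       # accumulated HMM log-likelihood


def merge_candidates(candidates: Iterable[Candidate]) -> List[Candidate]:
    """Keep only the best-scoring candidate per (phonemes, grammar_state) pair.

    Assumption: merging keeps the maximum log-likelihood; the abstract
    does not specify the scoring rule used by the authors.
    """
    best: Dict[Tuple[Tuple[str, ...], int], Candidate] = {}
    for cand in candidates:
        key = (cand.phonemes, cand.grammar_state)
        kept = best.get(key)
        if kept is None or cand.log_likelihood > kept.log_likelihood:
            best[key] = cand
    # Sort best-first so a beam can be cut from the front of the list.
    return sorted(best.values(), key=lambda c: c.log_likelihood, reverse=True)


if __name__ == "__main__":
    beam = [
        Candidate(("s", "a"), 3, -12.5),
        Candidate(("s", "a"), 3, -10.0),  # same key as above, better score
        Candidate(("s", "a"), 4, -11.0),  # same phonemes, different grammar state
        Candidate(("s", "o"), 3, -9.0),
    ]
    for cand in merge_candidates(beam):
        print(cand)
```

Merging on the pair rather than on the phoneme sequence alone keeps apart hypotheses that sound the same but leave the parser in different states, since they can be extended differently later in the utterance.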

Bibliographic record

  • Source
    Human Language Technology | 1994 | pp. 387-392 | 6 pages
  • Conference location: Plainsboro, NJ (US)
  • Author affiliations

    NTT Human Interface Laboratories 3-9-11 Midori-cho, Musashino-shi, Tokyo, 180 Japan;

  • Original format: PDF
  • Language of text: English
  • Chinese Library Classification: Computer software
