Conference: INTERSPEECH 2012

Improvements in Japanese Voice Search

Abstract

This paper describes work on Japanese voice search at Yahoo! Japan. We first describe several implementation details of our WFST-based internal decoder that make the voice search task more efficient, including a simple but effective compressed WFST arc representation. This representation permits a decoder process of roughly 2 GB of memory for a 1-million-word vocabulary and a 35-million N-gram language model. We then describe our baseline system built on this decoder and compare it against two open-source decoders, Juicer and Julius. We also describe our initial attempts to adapt the baseline system through simple language model adaptation using manually transcribed, anonymized voice queries. To achieve this, we present a sequence of WFST operations that preserves consistency of segmentation between manual and automatic transcriptions. Even with this simple adaptation method, we obtain relative reductions of up to 4.6% in sentence error rate and 8.2% in character error rate.
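The two metrics quoted above can be illustrated with a minimal sketch. It assumes the conventional definitions: character error rate as character-level edit distance divided by the total number of reference characters, sentence error rate as the fraction of utterances with any mismatch, and relative reduction as (baseline - adapted) / baseline. The abstract does not spell out its scoring procedure, so the function names and the toy queries below are hypothetical and are not taken from the paper.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (characters here)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_error_rate(refs, hyps):
    """Total character edits divided by total reference characters."""
    errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    total = sum(len(r) for r in refs)
    return errors / total


def sentence_error_rate(refs, hyps):
    """Fraction of utterances whose hypothesis differs from the reference."""
    wrong = sum(1 for r, h in zip(refs, hyps) if r != h)
    return wrong / len(refs)


def relative_reduction(baseline, adapted):
    """Relative error reduction of an adapted system over a baseline."""
    return (baseline - adapted) / baseline


if __name__ == "__main__":
    # Toy data for illustration only (not results from the paper).
    refs = ["東京駅", "天気予報"]
    hyps = ["東京駅", "天気予防"]
    print("CER:", character_error_rate(refs, hyps))
    print("SER:", sentence_error_rate(refs, hyps))
```

Scoring at the character level sidesteps the fact that Japanese text has no whitespace word boundaries, which is one reason a character error rate is commonly reported alongside a sentence error rate for this kind of task.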
