
Learning New Word Pronunciations from Spoken Examples

Abstract

A lexicon containing explicit mappings between words and pronunciations is an integral part of most automatic speech recognizers (ASRs). While many ASR components can be trained or adapted using data, the lexicon is one of the few that typically remains static until experts make manual changes. This work takes a step towards alleviating the need for manual intervention by integrating a popular grapheme-to-phoneme conversion technique with acoustic examples to automatically learn high-quality baseform pronunciations for unknown words. We explore two models in a Bayesian framework, and discuss their individual advantages and shortcomings. We show that both are able to generate better-than-expert pronunciations with respect to word error rate on an isolated word recognition task.
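
One way to read the combination described above is as a Bayesian selection rule: a grapheme-to-phoneme model supplies a prior over candidate baseforms for the word, and each spoken example contributes an acoustic likelihood. The sketch below is only an illustration of that general idea, not either of the two models explored in this work. The function name `pick_pronunciation`, the candidate pronunciations, and every number are made-up assumptions; in practice the prior would come from a trained grapheme-to-phoneme model and the likelihoods from a recognizer's acoustic model.

```python
import math


def pick_pronunciation(g2p_log_prior, acoustic_log_liks):
    """Pick the baseform b maximizing log P(b | word) + sum_i log P(u_i | b).

    g2p_log_prior: dict mapping candidate baseform -> log prior (hypothetical
        output of a grapheme-to-phoneme model).
    acoustic_log_liks: dict mapping candidate baseform -> list of acoustic
        log-likelihoods, one per spoken example (hypothetical values).
    """
    scores = {}
    for baseform, log_prior in g2p_log_prior.items():
        # Spoken examples are treated as independent given the baseform,
        # so their log-likelihoods simply add.
        scores[baseform] = log_prior + sum(acoustic_log_liks[baseform])
    best = max(scores, key=scores.get)
    return best, scores


# Toy inputs: all values are invented for illustration only.
g2p_log_prior = {
    "t ah m ey t ow": math.log(0.6),
    "t ah m aa t ow": math.log(0.4),
}
acoustic_log_liks = {
    "t ah m ey t ow": [-12.0, -11.5],
    "t ah m aa t ow": [-10.0, -10.5],
}

best, scores = pick_pronunciation(g2p_log_prior, acoustic_log_liks)
print(best)
```

With these toy numbers the acoustic evidence outweighs the spelling-based prior, so the candidate that the prior ranks second is selected. That is the kind of correction spoken examples can provide for an unknown word.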
