
Sequence-Based Pronunciation Variation Modeling for Spontaneous ASR Using a Noisy Channel Approach



Abstract

The performance of English automatic speech recognition (ASR) systems degrades on spontaneous speech, mainly because utterances contain multiple pronunciation variants. Previous approaches address this problem by modeling pronunciation alterations at the phoneme-to-phoneme level. However, the phonetic transformation effects induced by the pronunciation of the whole sentence have not yet been considered. In this article, sequence-based pronunciation variation is modeled with a noisy channel approach: the spontaneous phoneme sequence is treated as a "noisy" string, and the goal is to recover the "clean" string of the word sequence. In this way, the whole word sequence and its effect on the alteration of the phonemes are taken into consideration. Moreover, the system learns not only the phoneme transformation but also the direct mapping from phonemes to words. In this study, the phonemes are first recognized with the existing recognition system; the pronunciation variation model based on the noisy channel approach then maps from the phoneme level to the word level. Two well-known natural language processing approaches derived from noisy channel model theory are adopted: joint-sequence models and statistical machine translation. Both are applied, and various experiments are conducted on microphone and telephone spontaneous speech data.
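
The noisy channel idea in the abstract can be illustrated with a minimal sketch. In its standard form, the decoder picks the word W that maximizes P(W) · P(Q | W), where Q is the observed "noisy" phoneme string. The example below is a toy single-word decoder, not the paper's joint-sequence or machine translation system: the tables `PRIOR` and `CHANNEL`, their probabilities, the phoneme spellings, and the function `decode` are all hypothetical and chosen only for illustration.

```python
import math
from typing import Optional

# Hypothetical toy tables; none of these values come from the paper.
# PRIOR[w]: language-model probability of the "clean" word w.
PRIOR = {"going": 0.6, "gone": 0.4}

# CHANNEL[w][q]: probability that word w is realized as phoneme string q.
CHANNEL = {
    "going": {"g ow ih ng": 0.5, "g ow ih n": 0.4, "g aa n": 0.1},
    "gone": {"g ao n": 0.7, "g aa n": 0.3},
}


def decode(phonemes: str) -> Optional[str]:
    """Return argmax_w P(w) * P(phonemes | w), or None if no word can emit it."""
    best_word, best_score = None, -math.inf
    for word, prior in PRIOR.items():
        likelihood = CHANNEL[word].get(phonemes, 0.0)
        if likelihood == 0.0:
            continue  # this word cannot produce the observed string
        # Work in log space, as is usual when multiplying small probabilities.
        score = math.log(prior) + math.log(likelihood)
        if score > best_score:
            best_word, best_score = word, score
    return best_word
```

Under these made-up numbers, the ambiguous string "g aa n" is resolved by combining both terms: the channel probability favors "gone" strongly enough to outweigh the higher prior of "going". A sequence-level model of the kind the abstract describes would apply the same principle to whole phoneme and word sequences rather than to isolated words.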

Bibliographic record

  • Source
    IEICE Transactions on Information and Systems | 2012, No. 8 | pp. 2084-2093 | 10 pages
  • Author affiliations

    National Institute of Information and Communications Technology, Kyoto-fu, 619-0289 Japan; Department of Information Technology, University of Ulm, Germany;

    National Institute of Information and Communications Technology, Kyoto-fu, 619-0289 Japan; Institute of Science and Technology, Nara, Japan;

    National Institute of Information and Communications Technology, Kyoto-fu, 619-0289 Japan;

    National Institute of Information and Communications Technology, Kyoto-fu, 619-0289 Japan;

    National Institute of Information and Communications Technology, Kyoto-fu, 619-0289 Japan; presently with Nara Institute of Science and Technology, Nara, Japan;

    Department of Information Technology, University of Ulm, Germany;

  • Indexed in: Science Citation Index (SCI), USA; Engineering Index (EI), USA
  • Original format: PDF
  • Language of text: eng
  • Chinese Library Classification:
  • Keywords

    spontaneous speech; noisy channel approach; joint-sequence models; statistical machine translation;

  • Date added to database: 2022-08-18 00:26:19
