Sequence-Based Pronunciation Modeling Using a Noisy-Channel Approach

Abstract

Previous approaches to spontaneous speech recognition address the multiple-pronunciation problem by modeling pronunciation alterations at the phoneme-to-phoneme level. However, the phonetic transformation effects induced by the pronunciation of the whole sentence have not yet been considered. In this paper, we attempt to model sequence-based pronunciation variation using a noisy-channel approach, in which the spontaneous phoneme sequence is treated as a "noisy" string and the goal is to recover the "clean" string of the word sequence. In this way, the whole word sequence and its effect on the alteration of the phonemes are taken into consideration. Moreover, the system learns not only the phoneme transformation but also the mapping from phonemes to words directly. In this preliminary study, the phonemes are first recognized with the existing recognition system, and the pronunciation variation model based on the noisy-channel approach then maps from the phoneme level to the word level. Our experiments use Switchboard as the spontaneous speech corpus. The results show that the proposed method consistently improves word accuracy over the conventional recognition system. The best system achieves up to 38.9% relative improvement over the baseline speech recognition system.
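As a rough illustration of the noisy-channel idea described in the abstract, the sketch below picks the word sequence W that maximizes P(observed phonemes | W) · P(W) for an observed spontaneous phoneme string. This is a minimal toy under stated assumptions, not the authors' system: the candidate word sequences, the phoneme strings, the probability values, and the names `LANGUAGE_MODEL`, `CHANNEL_MODEL`, and `decode` are all hypothetical and invented for this example.

```python
import math

# Hypothetical toy tables, invented for illustration only (not from the paper).
# Language model P(W) over candidate "clean" word sequences.
LANGUAGE_MODEL = {
    ("going", "to"): 0.6,
    ("gonna",): 0.1,
    ("gone", "too"): 0.3,
}

# Channel model P(observed phonemes | W): how likely each word sequence is
# to surface as a given "noisy" spontaneous phoneme string.
CHANNEL_MODEL = {
    ("going", "to"): {"g ah n ah": 0.5, "g ow ih ng t uw": 0.5},
    ("gonna",): {"g ah n ah": 1.0},
    ("gone", "too"): {"g ao n t uw": 0.9, "g ah n ah": 0.1},
}


def decode(observed_phonemes):
    """Return the word sequence maximizing P(observed | W) * P(W), or None."""
    best_words, best_score = None, -math.inf
    for words, prior in LANGUAGE_MODEL.items():
        likelihood = CHANNEL_MODEL[words].get(observed_phonemes, 0.0)
        if likelihood == 0.0:
            continue  # this word sequence cannot produce the observation
        # Work in log space so products of small probabilities stay stable.
        score = math.log(likelihood) + math.log(prior)
        if score > best_score:
            best_words, best_score = words, score
    return best_words


print(decode("g ah n ah"))  # ('going', 'to')
```

In this toy, the reduced form "g ah n ah" is mapped to "going to" because the product of channel and language model probabilities (0.5 × 0.6) beats the alternatives (1.0 × 0.1 and 0.1 × 0.3). A real system would search over a far larger hypothesis space with learned models rather than enumerate a fixed table.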


Bibliographic details

Source
Spoken Dialogue Systems for Ambient Environments | 2010 | pp. 156-162 | 7 pages
Conference location: Gotemba (JP)
Authors
Hansjörg Hofmann; Sakriani Sakti; Ryosuke Isotani; Hisashi Kawai; Satoshi Nakamura; Wolfgang Minker
Affiliations

National Institute of Information and Communications Technology, Japan; University of Ulm, Germany

National Institute of Information and Communications Technology, Japan

National Institute of Information and Communications Technology, Japan

National Institute of Information and Communications Technology, Japan

National Institute of Information and Communications Technology, Japan

University of Ulm, Germany
Original format: PDF
Language: English
Chinese Library Classification: Artificial intelligence theory
Keywords
spontaneous speech recognition; pronunciation variation; noisy-channel model; statistical machine translation


Similar literature

1. Sequence-Based Pronunciation Variation Modeling for Spontaneous ASR Using a Noisy Channel Approach [J]. Hansjörg HOFMANN, Sakriani SAKTI, Chiori HORI. IEICE Transactions on Information and Systems. 2012, No. 8
2. Sequence-Based Pronunciation Variation Modeling for Spontaneous ASR Using a Noisy Channel Approach [J]. Hansjoerg HOFMANN, Sakriani SAKTI, Chiori HORI. IEICE Transactions on Information and Systems. 2012, No. 8
3. Extended Noisy-Channel Models for Capacitively Coupled Personal Area Network Under Influence of a Wall [J]. Sasaki A.-I., Mizota T., Morimura H. IEEE Transactions on Antennas and Propagation. 2014, No. 5
4. Sequence-Based Pronunciation Modeling Using a Noisy-Channel Approach [C]. Hansjörg Hofmann, Sakriani Sakti, Ryosuke Isotani. International Workshop on Spoken Dialogue Systems Technology. 2010
5. Phonemic Awareness Transfer from Spanish to English: A Way to Approach English Pronunciation = TRANSFERENCIA DE CONCIENCIA FONOLÓGICA DEL ESPAÑOL AL INGLÉS: UNA MANERA DE ABORDAR LA PRONUNCIACIÓN EN INGLÉS [D]. Méndez Rojas, Jhon Jairo. 2020
6. Exploring a Phonological Process Approach to Adult Pronunciation Training [O]. Amber Franklin, Lana McDaniel
7. A Noisy-Channel Approach to Question Answering [O]. 2008
8. Noisy-Channel Approach to Question Answering [R]. Echihabi, A., Marcu, D. 2003
