Learning Speech Variability in Discriminative Acoustic Model Adaptation

Shoei SATO; Takahiro OKU; Shinichi HOMMA; Akio KOBAYASHI; Toru IMAI

Abstract

We present a new discriminative method of acoustic model adaptation that deals with task-dependent speech variability. We focus on differences in expressions and speaking styles between tasks, and the method aims to improve the recognition accuracy of indistinctly pronounced phrases that depend on a speaking style. The adaptation appends subword models for variants of subwords that are frequently observed in the task. To find the task-dependent variants, low-confidence words are statistically selected from the words that occur with higher frequency in the task's adaptation data, using their word lattices. HMM parameters of the subword models that depend on these words are discriminatively trained by using linear transforms with a minimum phoneme error (MPE) criterion. For the MPE training, a subword accuracy measure that discriminates between the variants and the originals is also investigated. In speech recognition experiments, the proposed adaptation with the subword variants reduced the word error rate by 12.0% relative in a Japanese conversational broadcast task.
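The word-selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not state which statistic or thresholds the authors use, so everything specific here is an assumption made for illustration: the function name `select_variant_candidates`, the use of a mean confidence score per word, and the parameters `min_count` and `max_mean_conf` are all hypothetical. The input is assumed to be a list of (word, confidence) pairs, for example word posteriors read off recognition lattices of the adaptation data.

```python
from collections import defaultdict
from statistics import mean


def select_variant_candidates(observations, min_count=3, max_mean_conf=0.6):
    """Pick frequent words whose average confidence is low.

    observations: iterable of (word, confidence) pairs, confidence in [0, 1].
    min_count and max_mean_conf are hypothetical thresholds, not values
    taken from the paper.
    """
    # Group confidence scores by word.
    scores_by_word = defaultdict(list)
    for word, confidence in observations:
        scores_by_word[word].append(confidence)

    # Keep words that are both frequent enough and poorly recognized.
    candidates = []
    for word, scores in scores_by_word.items():
        avg = mean(scores)
        if len(scores) >= min_count and avg <= max_mean_conf:
            candidates.append((word, len(scores), avg))

    # Least confident words first.
    return sorted(candidates, key=lambda item: item[2])


if __name__ == "__main__":
    # Made-up toy data, for illustration only.
    toy = [
        ("desu", 0.4), ("desu", 0.5), ("desu", 0.3),
        ("nhk", 0.9), ("nhk", 0.95), ("nhk", 0.9),
        ("rare", 0.1),
    ]
    for word, count, avg in select_variant_candidates(toy):
        print(word, count, round(avg, 3))
```

In this toy data only "desu" is selected: "nhk" is frequent but confidently recognized, and "rare" has low confidence but occurs too seldom. In the method described above, the subwords of such selected words would then receive their own variant models; that training step is not sketched here.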

Bibliographic details

Source
IEICE Transactions on Information and Systems | 2010, No. 9 | pp. 2370-2378 | 9 pages
Authors
Shoei SATO; Takahiro OKU; Shinichi HOMMA; Akio KOBAYASHI; Toru IMAI
Author affiliations

NHK (Japan Broadcasting Corporation) Science & Technology Research Laboratories, Tokyo, 157-8510 Japan (listed for each of the five authors)

Indexed in: Science Citation Index (SCI, USA); Engineering Index (EI, USA)
Original format: PDF
Language: English
Keywords
speech recognition; speech variability; discriminative training; acoustic model;
Added to database: 2022-08-18 00:26:59

Similar literature

1. Learning Speech Variability in Discriminative Acoustic Model Adaptation [J]. Shoei SATO, Takahiro OKU, Shinichi HOMMA, IEICE Transactions on Information and Systems. 2010, No. 9
2. Morpho-Phonetic Effects in Speech Production: Modeling the Acoustic Duration of English Derived Words With Linear Discriminative Learning [J]. Simon David Stein, Ingo Plag. Frontiers in Psychology. 2021, No. a
3. Acoustic model adaptation based on pronunciation variability analysis for non-native speech recognition [J]. Yoo Rhee Oh, Jae Sam Yoon, Hong Kook Kim. Speech Communication. 2007, No. 1
4. Learning task-dependent speech variability in discriminative acoustic model adaptation [C]. Sato, Shoei, Oku, Takahiro, Homma, Shinichi, IEEE International Conference on Acoustics Speech and Signal; ICASSP 2010. 2010
5. Acoustic modeling for automatic speech recognition: Deriving discriminative Gaussian networks [D]. Teunen, Remco. 2003
6. Morpho-Phonetic Effects in Speech Production: Modeling the Acoustic Duration of English Derived Words With Linear Discriminative Learning [O]. Simon David Stein, Ingo Plag. 2021
7. Words from spontaneous conversational speech can be recognized with human-like accuracy by an error-driven learning algorithm that discriminates between meanings straight from smart acoustic features, bypassing the phoneme as recognition unit [O]. Arnold Denis, Tomaschek Fabian, Sering Konstantin, 2017
