Improvement of N-gram language models using accent phrase boundaries

Makoto Terao; Nobuaki Minematsu; Keikichi Hirose

Abstract

Current continuous speech recognition systems rely heavily on segmental features but make little use of prosodic features. This paper proposes a novel method for integrating prosodic boundary information into N-gram-based language modeling. In this method, two types of language sub-models are built: one characterizes word transitions that cross accent phrase boundaries, and the other characterizes transitions that do not. Building these two sub-models directly from a speech corpus would require a corpus comparable in size to the text corpora used for N-gram model training, and preparing such a large speech corpus is not realistic. To solve this problem, we focus on word transitions in terms of part-of-speech (POS), and use the differences between POS transitions that cross the boundaries and those that do not to generate the two sub-models. In experiments, the proposed model reduced perplexity by 11% when given the correct boundary positions and by 8% with automatically extracted boundaries. Even when the test speech samples came from a different speaker than the one used to characterize the POS transitions, a 6% reduction was observed.
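The abstract does not give the formula used to derive the two sub-models, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes a baseline bigram model is rescaled by the ratio of a boundary-conditioned POS-transition probability to the unconditioned one and then renormalized. The function names, the three-word vocabulary, and all probability values are invented for illustration.

```python
import math

def build_submodel(bigram, pos_of, pos_all, pos_cond):
    """Rescale P(word | prev) by a POS-transition ratio, then renormalize.

    Assumed scheme (not taken from the paper):
        P_b(w | v) is proportional to P(w | v) * P(c_w | c_v, b) / P(c_w | c_v)
    where c_x is the POS of x and b is the boundary condition.
    """
    sub = {}
    for prev, dist in bigram.items():
        scaled = {}
        for word, p in dist.items():
            key = (pos_of[prev], pos_of[word])
            scaled[word] = p * pos_cond[key] / pos_all[key]
        total = sum(scaled.values())
        sub[prev] = {word: s / total for word, s in scaled.items()}
    return sub

def transition_probs(words, boundaries, cross, within):
    """Pick the sub-model for each transition from its boundary flag."""
    probs = []
    for i in range(1, len(words)):
        model = cross if boundaries[i - 1] else within
        probs.append(model[words[i - 1]][words[i]])
    return probs

def perplexity(probs):
    """Perplexity of a sequence of per-transition probabilities."""
    return math.exp(-sum(math.log(p) for p in probs) / len(probs))

# Toy data, all values hypothetical.
pos_of = {"inu": "N", "ga": "P", "hashiru": "V"}
bigram = {                       # baseline P(word | prev) from text
    "inu": {"ga": 0.8, "hashiru": 0.2},
    "ga": {"inu": 0.3, "hashiru": 0.7},
}
pos_all = {("N", "P"): 0.8, ("N", "V"): 0.2, ("P", "N"): 0.3, ("P", "V"): 0.7}
# POS transitions as they might be estimated from a small labeled speech corpus.
pos_cross = {("N", "P"): 0.2, ("N", "V"): 0.8, ("P", "N"): 0.2, ("P", "V"): 0.8}
pos_within = {("N", "P"): 0.95, ("N", "V"): 0.05, ("P", "N"): 0.4, ("P", "V"): 0.6}

cross = build_submodel(bigram, pos_of, pos_all, pos_cross)
within = build_submodel(bigram, pos_of, pos_all, pos_within)

words = ["inu", "ga", "hashiru"]
boundaries = [False, True]       # accent phrase boundary after "ga"

pp_baseline = perplexity(transition_probs(words, boundaries, bigram, bigram))
pp_proposed = perplexity(transition_probs(words, boundaries, cross, within))
reduction = 1.0 - pp_proposed / pp_baseline
print(f"baseline {pp_baseline:.3f}, proposed {pp_proposed:.3f}, "
      f"relative reduction {reduction:.1%}")
```

Only the small POS-transition tables depend on prosodically labeled speech in this sketch; the word-level statistics still come from the text-trained bigram, which is the reason the abstract gives for working at the POS level. The relative reduction computed at the end is the same kind of quantity as the 11%, 8%, and 6% figures reported above, although the toy numbers here say nothing about the real results.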

Bibliographic record

Source: 電子情報通信学会技術研究報告. 言語理解とコミュニケーション. Natural Language Understanding and Models of Communication, 2001, No. 520, 6 pages
Authors: Makoto Terao; Nobuaki Minematsu; Keikichi Hirose
Original format: PDF
Text language: Japanese (jpn)
Chinese Library Classification: Communications
Keywords: Language model; Prosody; Accent phrase boundary; Transition of part-of-speech; Continuous speech recognition
