Statistical parametric speech synthesis with a novel codebook-based excitation model

Tamas Gabor Csapo; Geza Nemeth


Abstract

Speech synthesis is an important modality in Cognitive Infocommunications, the field at the intersection of informatics and the cognitive sciences. Statistical parametric methods have recently gained importance in speech synthesis. In these methods the speech signal is decomposed into parameters and later reconstructed from them; the decomposition is implemented by speech coders. We apply a novel codebook-based speech coding method to model the excitation of speech. In the analysis stage, the speech signal is analyzed frame by frame, and a codebook of pitch-synchronous excitations is built from the voiced parts. Timing, gain, and harmonic-to-noise ratio parameters are extracted and fed into the machine learning stage of hidden Markov model-based speech synthesis. During the synthesis stage, the codebook is searched for a suitable element in each voiced frame, and the selected elements are concatenated to create the excitation signal, from which the final synthesized speech is generated. Our initial experiments show that the model fits well into the statistical parametric speech synthesis framework and that in most cases it synthesizes speech of better quality than traditional pulse-noise excitation. (This paper is an extended version of [10].)
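The codebook idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the choice to select entries by pitch-period length alone, the linear resampling step, and the fixed unvoiced frame length of 80 samples are all assumptions made for illustration, and the harmonic-to-noise ratio parameter mentioned in the abstract is left out.

```python
import numpy as np


def unit_rms(x):
    """Scale a segment to unit RMS (left unchanged if it is all zeros)."""
    rms = np.sqrt(np.mean(x ** 2))
    return x / rms if rms > 0 else x


def build_codebook(residual, pitch_marks):
    """Analysis: cut one excitation segment per pitch cycle of a voiced residual."""
    codebook = []
    for start, end in zip(pitch_marks[:-1], pitch_marks[1:]):
        segment = np.asarray(residual[start:end], dtype=float)
        if np.any(segment):
            codebook.append(unit_rms(segment))  # store gain-normalised cycles
    return codebook


def select_entry(codebook, target_period):
    """Search: pick the entry whose length is closest to the target period.

    Assumption: period length is the only selection criterion in this sketch.
    """
    lengths = np.array([len(c) for c in codebook])
    return codebook[int(np.argmin(np.abs(lengths - target_period)))]


def fit_length(segment, target_period):
    """Linearly resample a cycle so it spans exactly target_period samples."""
    old = np.linspace(0.0, 1.0, num=len(segment))
    new = np.linspace(0.0, 1.0, num=target_period)
    return np.interp(new, old, segment)


def synthesize_excitation(codebook, periods, gains, unvoiced_len=80, rng=None):
    """Synthesis: concatenate one scaled segment per frame.

    A period of 0 marks an unvoiced frame, which is filled with white noise
    (unvoiced_len is a hypothetical frame length, not taken from the paper).
    """
    if rng is None:
        rng = np.random.default_rng(0)
    pieces = []
    for period, gain in zip(periods, gains):
        if period > 0:
            cycle = fit_length(select_entry(codebook, period), period)
        else:
            cycle = rng.standard_normal(unvoiced_len)
        pieces.append(gain * unit_rms(cycle))
    return np.concatenate(pieces)
```

In a full system the resulting excitation would drive a synthesis filter built from the spectral parameters; here the sketch stops at the excitation signal. Storing gain-normalised cycles keeps the codebook independent of loudness, so the gain stream alone controls the output level.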


Bibliographic details

Source
Intelligent decision technologies, 2014, issue 4, pp. 289-299 (11 pages)
Authors
Tamas Gabor Csapo; Geza Nemeth
Author affiliations
Department of Telecommunications and Media Informatics, Budapest University of Technology and Economics, Budapest, Hungary (both authors)
Original format
PDF
Language
English
Keywords
Text-to-speech synthesis; speech processing; excitation model; vocoding; parametric

Similar literature

1. Excitation modelling using epoch features for statistical parametric speech synthesis [J]. M Kiran Reddy, K Sreenivasa Rao. Computer speech and language. 2020, issue "Mara"
2. Modeling Irregular Voice in Statistical Parametric Speech Synthesis With Residual Codebook Based Excitation [J]. Selected Topics in Signal Processing, IEEE Journal of. 2014, issue 2
3. GlotNet—A Raw Waveform Model for the Glottal Excitation in Statistical Parametric Speech Synthesis [J]. Juvela Lauri, Bollepalli Bajibabu, Tsiaras Vassilis. Audio, Speech, and Language Processing, IEEE/ACM Transactions on. 2019, issue 6
4. Improved time-frequency trajectory excitation modeling for a statistical parametric speech synthesis system [C]. Song Eunwoo, Joo Young-Sun, Kang Hong-Goo. IEEE International Conference on Acoustics, Speech and Signal Processing. 2015
5. Statistical Parametric Speech Synthesis using Deep Learning Architectures [D]. Kang, Shiyin. 2016
6. Discriminative Multi-Stream Postfilters Based on Deep Learning for Enhancing Statistical Parametric Speech Synthesis [O]. Marvin Coto-Jiménez. 2021
7. Directly Modeling Speech Waveforms by Neural Networks for Statistical Parametric Speech Synthesis [O]. Keiichi Tokuda, Heiga Zen. 2015
