Improving high quality concatenative text-to-speech synthesis using the circular linear prediction model.

Abstract

Current high-quality text-to-speech (TTS) systems are based on unit selection from a large database that is both contextually and prosodically rich. These systems, albeit capable of natural voice quality, are computationally expensive and require a very large footprint. Their success is attributed to the dramatic reduction of storage costs in recent times. However, for many TTS applications a smaller footprint is becoming a standard requirement. This thesis presents a new method for representing speech segments that can improve the quality and/or reduce the footprint of current concatenative TTS systems. The circular linear prediction (CLP) model is revisited and combined with the constant pitch transform (CPT) to provide a robust representation of speech signals that allows for limited prosodic movements without a perceivable loss in quality. The CLP model assumes that each frame of voiced speech is an infinitely periodic signal. This assumption allows for LPC modeling using the covariance method, with the efficiency of the autocorrelation method. The CPT is combined with this model to provide a database that is uniform in pitch for matching the target prosody during synthesis. With this representation, limited prosody modifications and unit concatenation can be performed without causing audible artifacts. To resolve artifacts caused by pitch modifications in voicing transitions, a method is introduced for reducing peakiness in the LP spectra by constraining the line spectral frequencies. Two experiments have been conducted to demonstrate the capabilities of the CLP/CPT method. The first is a listening test to determine the ability of this model to realize prosody modifications without perceivable degradation. Utterances are resynthesized using the CLP/CPT method with emphasized prosody to increase intelligibility in harsh environments. The second experiment compares the quality of utterances synthesized by unit-selection-based limited-domain TTS against the CLP/CPT method. The results demonstrate that the CLP/CPT representation, applied to current concatenative TTS systems, can reduce the size of the database and increase the prosodic richness without noticeable degradation in voice quality.
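
The abstract's claim that the periodicity assumption gives covariance-method LPC at autocorrelation-method cost can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names, the random test frame, the frame length of 80 samples, and the predictor order of 10 are all assumptions made for illustration. If a frame is treated as one period of an infinitely periodic signal, a least-squares fit over every sample of the periodic extension reduces to symmetric Toeplitz normal equations built from the circular autocorrelation.

```python
import numpy as np


def circular_autocorrelation(period, max_lag):
    """Autocorrelation of a frame treated as one period of a periodic signal."""
    x = np.asarray(period, dtype=float)
    # np.roll(x, k)[n] == x[(n - k) % N], so samples wrap around the frame
    # edge instead of being zero-padded or windowed.
    return np.array([np.dot(x, np.roll(x, k)) for k in range(max_lag + 1)])


def clp_coefficients(period, order):
    """Predictor a[k] with x[n] ~ sum_k a[k] * x[(n - k - 1) % N]."""
    r = circular_autocorrelation(period, order)
    # Symmetric Toeplitz normal equations, as in the autocorrelation method.
    lags = np.abs(np.subtract.outer(np.arange(order), np.arange(order)))
    return np.linalg.solve(r[lags], r[1:])


def covariance_style_coefficients(period, order):
    """Least-squares fit over the periodic extension, for comparison only."""
    x = np.asarray(period, dtype=float)
    # Column k holds the sample k + 1 steps in the past, wrapped circularly.
    past = np.column_stack([np.roll(x, k + 1) for k in range(order)])
    coeffs, *_ = np.linalg.lstsq(past, x, rcond=None)
    return coeffs


rng = np.random.default_rng(0)
frame = rng.standard_normal(80)  # hypothetical stand-in for one pitch period
a_clp = clp_coefficients(frame, order=10)
a_cov = covariance_style_coefficients(frame, order=10)
print(np.allclose(a_clp, a_cov))
```

The comparison at the end checks that both routes give the same predictor for this frame. A general linear solver is used here for brevity; the Toeplitz structure is what would allow a faster recursion such as Levinson-Durbin.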

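Bibliographic record

Author: Shukla, Sunil Ravindra
Author affiliation: Georgia Institute of Technology
Degree-granting institution: Georgia Institute of Technology
Subject: Engineering, Electronics and Electrical
Degree: Ph.D.
Year: 2007
Pages: 158 p.
Total pages: 158
Original format: PDF
Language: English
CLC classification: Radio electronics, telecommunications technology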