Quality Preserving Compression of a Concatenative Text-To-Speech Acoustic Database

Shoham T.; Malah D.; Shechtman S.

Abstract

A concatenative text-to-speech (CTTS) synthesizer requires a large acoustic database for high-quality speech synthesis. This database consists of many acoustic leaves, each containing a number of short, compressed speech segments. In this paper, we propose two algorithms that recompress the acoustic database by recompressing the data in each acoustic leaf, without compromising the perceptual quality of the resulting synthesized speech. This is achieved by exploiting the redundancy between speech frames and between speech segments in the acoustic leaf. The first approach is based on a vector polynomial temporal decomposition. The second is based on a 3-D shape-adaptive discrete cosine transform (DCT), followed by optimized quantization. In addition, we propose a segment ordering algorithm in an attempt to improve overall performance. The developed algorithms are generic and may be applied to a variety of compression challenges. When applied to compressed spectral amplitude parameters of a specific IBM small-footprint CTTS database, we obtain a recompression factor of 2 without any perceived degradation in the quality of the synthesized speech.
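
The transform-plus-quantization idea mentioned in the abstract can be illustrated with a much simpler stand-in. The sketch below is not the paper's 3-D shape-adaptive DCT or its optimized quantizer; it only shows the basic building block such schemes rely on, namely an orthonormal DCT whose length adapts to each variable-length parameter track, followed by a plain uniform quantizer. The function names, the example tracks, and the step size are all assumptions made for illustration.

```python
import math

def dct_ii(x):
    """Orthonormal DCT-II of a sequence of any length n >= 1."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out

def idct_ii(c):
    """Inverse of dct_ii (orthonormal DCT-III)."""
    n = len(c)
    out = []
    for i in range(n):
        s = c[0] * math.sqrt(1.0 / n)
        s += sum(
            c[k] * math.sqrt(2.0 / n) * math.cos(math.pi * (i + 0.5) * k / n)
            for k in range(1, n)
        )
        out.append(s)
    return out

def quantize(coeffs, step):
    """Uniform scalar quantizer: map each coefficient to an integer index."""
    return [round(c / step) for c in coeffs]

def dequantize(indices, step):
    """Map integer indices back to reconstructed coefficient values."""
    return [q * step for q in indices]

def code_track(track, step):
    """Transform, quantize, and reconstruct one parameter track.

    The DCT length equals the track length, so tracks of different
    lengths are handled without padding.
    """
    indices = quantize(dct_ii(track), step)
    recon = idct_ii(dequantize(indices, step))
    return indices, recon

# Hypothetical parameter tracks of different lengths (illustrative data only).
tracks = [[1.0, 1.1, 1.2, 1.1, 1.0], [2.0, 2.0, 2.1]]
for t in tracks:
    indices, recon = code_track(t, step=0.1)
    max_err = max(abs(a - b) for a, b in zip(t, recon))
    print(len(t), indices, round(max_err, 3))
```

Because the transform is orthonormal, the reconstruction error in the signal domain has the same energy as the quantization error in the coefficient domain, so the step size directly bounds the distortion. Slowly varying tracks concentrate their energy in the first few coefficients, which is what makes the remaining indices cheap to store.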

Bibliographic information

Source
IEEE Transactions on Audio, Speech, and Language Processing, 2012, No. 3, pp. 1056-1068 (13 pages)
Authors
Shoham T.; Malah D.; Shechtman S.
Author affiliation

Dept. of Electr. Eng., Technion - Israel Inst. of Technol., Haifa, Israel;

Original format: PDF
Language: English
Keywords
Acoustic leaf compression; concatenative text-to-speech (CTTS); discrete cosine transform (DCT); shape-adaptive DCT (SADCT); temporal decomposition (TD);

Similar literature

1. Speech Database Design for a Concatenative Text-to-Speech Synthesis System for Individuals with Communication Disorders [J]. Akemi Iida, Nick Campbell. International Journal of Speech Technology. 2003, No. 4
2. An Approach to Proper Speech Segmentation for Quality Improvement in Concatenative Text-To-Speech System for Indian Languages [J]. Sanghamitra Mohanty, Suman Bhattacharya, Sumit Bose. International Journal of Computer Processing of Oriental Languages. 2005, No. 1
3. F0 Gradient Model for Acoustic Quality and F0 Consistency of Concatenative TTS [J]. Ryuki Tachibana, Tohru Nagano, Masafumi Nishimura. 電子情報通信学会技術研究報告. 音声. Speech. 2007, No. 406
4. Database Mining for Flexible Concatenative Text-to-Speech [C]. Eide, E.M., Fernandez, . 2007
5. Improving high quality concatenative text-to-speech synthesis using the circular linear prediction model [D]. Shukla, Sunil Ravindra. 2007
6. Effects of Compression on Speech Acoustics Intelligibility and Sound Quality [O]. Pamela E. Souza. 2002
7. Assessment and correction of voice quality variabilities in large speech databases for concatenative speech synthesis [O]. Yannis Stylianou. 1999
8. Updated Deep-Tow Acoustics/Geophysics System Compressional Velocity Database [R]. Rowe, M. M., Gettrust, J. F. 1994
