Journal: IEEE Transactions on Audio, Speech, and Language Processing

Quality Preserving Compression of a Concatenative Text-To-Speech Acoustic Database



Abstract

A concatenative text-to-speech (CTTS) synthesizer requires a large acoustic database for high-quality speech synthesis. This database consists of many acoustic leaves, each containing a number of short, compressed speech segments. In this paper, we propose two algorithms for recompressing the acoustic database by recompressing the data in each acoustic leaf, without compromising the perceptual quality of the resulting synthesized speech. This is achieved by exploiting the redundancy between speech frames and between speech segments in the acoustic leaf. The first approach is based on a vector polynomial temporal decomposition. The second is based on a 3-D shape-adaptive discrete cosine transform (DCT), followed by optimized quantization. In addition, we propose a segment ordering algorithm in an attempt to improve overall performance. The developed algorithms are generic and may be applied to a variety of compression challenges. When applied to the compressed spectral amplitude parameters of a specific IBM small-footprint CTTS database, we obtain a recompression factor of 2 without any perceived degradation in the quality of the synthesized speech.
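The general idea of exploiting redundancy between neighboring frames with a transform followed by quantization can be illustrated with a much simpler stand-in than the method in the paper. The sketch below applies a plain 1-D orthonormal DCT-II and a uniform quantizer to the trajectory of a single parameter across frames. It is not the 3-D shape-adaptive DCT or the optimized quantization described in the abstract, and the trajectory values, the step size, and all function names are assumptions chosen only for illustration.

```python
import math


def dct(x):
    """Orthonormal DCT-II of a 1-D sequence."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[t] * math.cos(math.pi * (t + 0.5) * k / n) for t in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def idct(c):
    """Inverse of the orthonormal DCT-II above (an orthonormal DCT-III)."""
    n = len(c)
    out = []
    for t in range(n):
        s = c[0] * math.sqrt(1.0 / n)
        s += sum(
            c[k] * math.sqrt(2.0 / n) * math.cos(math.pi * (t + 0.5) * k / n)
            for k in range(1, n)
        )
        out.append(s)
    return out


def quantize(coeffs, step):
    """Uniform scalar quantization: map each coefficient to an integer index."""
    return [round(c / step) for c in coeffs]


def dequantize(indices, step):
    """Map integer indices back to reconstructed coefficient values."""
    return [i * step for i in indices]


# Hypothetical trajectory of one spectral parameter over 8 consecutive frames.
trajectory = [1.00, 1.05, 1.12, 1.18, 1.21, 1.22, 1.20, 1.15]
step = 0.05  # assumed quantizer step size

indices = quantize(dct(trajectory), step)
reconstruction = idct(dequantize(indices, step))

# Smooth trajectories concentrate energy in few coefficients, so some
# indices quantize to zero and need not be stored explicitly.
nonzero = sum(1 for i in indices if i != 0)
max_err = max(abs(a - b) for a, b in zip(trajectory, reconstruction))
print(f"{nonzero} of {len(indices)} coefficients are nonzero; max error {max_err:.4f}")
```

Because the transform is orthonormal, the reconstruction error is bounded by the quantizer step, which is the knob a real codec would tune against perceived quality.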
