Low bit-rate voice compression based on frequency domain interpolative techniques

Bhaskar U.; Swaminathan K.

Abstract

This paper presents an approach, referred to as frequency domain interpolation (FDI), for achieving high-quality speech at low bit-rates (4 kb/s and below) within reasonable complexity and delay. FDI methods, like prototype waveform interpolation (PWI) methods, derive a prototype waveform (PW) at regular intervals of time. Unlike PWI, however, FDI does not separate the PW into a slowly evolving waveform (SEW) and a rapidly evolving waveform (REW) component. Instead, the PW is encoded after gain normalization in magnitude-phase form. The magnitude is modeled as a sum of mean and deviation values in multiple frequency bands, and this model is quantized using switched backward adaptive VQ techniques. The phase information is represented as a composite vector of PW correlations in multiple frequency bands and an overall voicing measure. This information is quantized using a VQ at the encoder. At the decoder, a phase model is employed that uses the received phase (and magnitude) information to reproduce PWs with the correct periodicity and evolutionary characteristics. Speech is synthesized by interpolating the reconstructed PWs after gain adjustment and filtering the result using the short-term predictor and a postfilter. The designs of a 4-kb/s and a 2.4-kb/s FDI codec are presented in this paper, and their performance is characterized in terms of delay, complexity, and subjective voice quality. The results confirm that FDI techniques have the potential for delivering high-quality speech at low bit-rates in a cost-effective manner.
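The abstract describes two steps that a small example may make more concrete: modeling the PW magnitude as a per-band mean plus deviations, and interpolating between successive PWs at synthesis. The Python sketch below is only an illustration under stated assumptions and is not the authors' codec. The function names, the band edges, the equal-length PW vectors, and the use of plain linear interpolation are all hypothetical choices; the paper's actual band layout, quantizers, and phase model are not given in this record.

```python
from statistics import fmean


def band_mean_deviation(magnitude, band_edges):
    """Split a magnitude vector into per-band means and per-bin deviations.

    Assumption: band_edges starts at 0, ends at len(magnitude), and is
    strictly increasing, so every bin belongs to exactly one band.
    """
    if band_edges[0] != 0 or band_edges[-1] != len(magnitude):
        raise ValueError("band_edges must cover the whole magnitude vector")
    means, deviations = [], []
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        band = magnitude[lo:hi]
        mean = fmean(band)
        means.append(mean)
        # Deviation of each bin from the mean of its own band.
        deviations.extend(x - mean for x in band)
    return means, deviations


def reconstruct_magnitude(means, deviations, band_edges):
    """Rebuild the magnitude vector as band mean plus deviation."""
    out = []
    for mean, lo, hi in zip(means, band_edges[:-1], band_edges[1:]):
        out.extend(mean + d for d in deviations[lo:hi])
    return out


def interpolate_pw(pw_prev, pw_next, alpha):
    """Linearly interpolate two equal-length PW vectors, 0 <= alpha <= 1.

    Assumption: both PWs have the same number of components; a real codec
    must also handle a pitch period that changes between update instants.
    """
    return [(1 - alpha) * a + alpha * b for a, b in zip(pw_prev, pw_next)]


if __name__ == "__main__":
    mag = [1.0, 3.0, 2.0, 2.0, 4.0, 6.0]
    edges = [0, 2, 4, 6]  # hypothetical three-band layout
    means, devs = band_mean_deviation(mag, edges)
    print(means, devs)
    print(reconstruct_magnitude(means, devs, edges))
    print(interpolate_pw([0.0, 2.0], [2.0, 4.0], 0.5))
```

In a codec of the kind the abstract outlines, the means and deviations would be vector-quantized rather than passed through unchanged, so the reconstruction step would recover only an approximation of the original magnitude.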

Bibliographic details

Source: IEEE Transactions on Audio, Speech and Language Processing, 2006, no. 2, pp. 558-576 (19 pages)
Authors: Bhaskar U.; Swaminathan K.
Original format: PDF
Language: English
Chinese Library Classification: automation technology, computer technology
Keywords: decoding; frequency-domain analysis; interpolation; speech codecs; speech coding; vector quantisation; vocoders; decoder; frequency domain interpolative techniques; low bit-rate voice compression; switched backward adaptive VQ; vector quantization; voice codec; Frequ;
