Conference: IEE conference on telecommunications

Non-linear prototype waveform interpolation for voiced speech encoding



Abstract

Prototype waveform interpolation (PWI) is a practical and promising coding technique applicable to voiced speech. The waveform and duration of only one pitch cycle (the prototype) per frame are extracted and coded using LPC techniques. Segments of missing speech between the prototypes are reconstructed at the receiver by interpolation from the decoded prototype waveforms. Although waveform reconstruction may not be very accurate over the interpolated segments, surprisingly good speech quality can be achieved at bit rates in the region of 2.5 to 3.5 kb/s using frames of about 20 ms duration, provided the prototype waveforms and pitch periodicity are satisfactorily reproduced. For reasons of low complexity and bit rate, most reported work on PWI uses linear interpolation methods with linear interpolation functions, but these have inherent difficulty reproducing non-uniform variations in pitch cycle waveforms and lengths. It is shown that nonlinear techniques can improve the representation of voiced speech in interpolated segments without significantly increasing bit rates. Pitch structure is improved by using a temporal differential rate codebook to transmit small differences in the duration of pitch cycles. Waveform fidelity is improved by deriving optimal combination coefficients (OCC), which determine the composition of each pitch cycle waveform in terms of the given prototypes at segment boundaries. The OCC vectors allow for nonlinear variation in waveform composition and are vector quantised for transmission.
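To make the contrast in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes each prototype is stored as an array holding one pitch cycle, builds the intermediate cycles with the linear baseline (cycle length and waveform both vary linearly between the two prototypes), and then fits per-cycle combination weights by ordinary least squares. The function names `resample_cycle`, `linear_pwi` and `fit_combination_coeffs` are hypothetical, and the least-squares criterion is an assumption about what "optimal" might mean; the abstract does not specify how the OCC are derived or quantised.

```python
import numpy as np


def resample_cycle(cycle, length):
    """Resample one pitch cycle to `length` samples (hypothetical helper)."""
    cycle = np.asarray(cycle, dtype=float)
    src = np.linspace(0.0, 1.0, num=len(cycle), endpoint=False)
    dst = np.linspace(0.0, 1.0, num=length, endpoint=False)
    # period=1.0 treats the cycle as periodic, so the wrap-around is interpolated too.
    return np.interp(dst, src, cycle, period=1.0)


def linear_pwi(proto_a, proto_b, n_cycles):
    """Linear baseline: n_cycles intermediate pitch cycles between two prototypes."""
    cycles = []
    for k in range(1, n_cycles + 1):
        w = k / (n_cycles + 1)  # weight moves linearly from prototype A towards B
        length = int(round((1 - w) * len(proto_a) + w * len(proto_b)))
        a = resample_cycle(proto_a, length)
        b = resample_cycle(proto_b, length)
        cycles.append((1 - w) * a + w * b)
    return cycles


def fit_combination_coeffs(target, proto_a, proto_b):
    """Assumed criterion: least-squares weights (c_a, c_b) with c_a*A + c_b*B ~ target."""
    target = np.asarray(target, dtype=float)
    n = len(target)
    basis = np.column_stack(
        [resample_cycle(proto_a, n), resample_cycle(proto_b, n)]
    )
    coeffs, *_ = np.linalg.lstsq(basis, target, rcond=None)
    return coeffs
```

In the linear baseline the weights are fixed by the position of a cycle within the segment. Fitting them per cycle against the original speech, as in the last function, lets the mix of the two prototypes vary non-uniformly across the segment, which is the kind of freedom the abstract attributes to the OCC vectors.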
