Conference: IEEE International Conference on Multimedia and Expo

PERCEPTUAL-MVDR BASED ANALYSIS-SYNTHESIS OF PITCH SYNCHRONOUS FRAMES FOR PITCH MODIFICATION


Abstract

In our earlier work [1, 2], we employed minimum variance distortionless response (MVDR) and MVDR-Bauer, respectively, as spectral estimation techniques in place of modified linear prediction in discrete cosine transform (DCT) based pitch modification [3]. As a general extension, we introduce psychoacoustic characteristics into the methods of [1, 2], resulting in the Perceptual-MVDR (PMVDR) and PMVDR-Bauer algorithms used here for spectral estimation. Further, we employ the Bauer method of spectral factorization in the latter algorithm because it yields a causal inverse filter. These are used to obtain the residual signal from pitch-synchronous speech frames. The residual signal is resampled using DCT/IDCT according to the target pitch scale factor. Finally, forward filters realized from the above factorization are used to obtain the pitch-modified speech. The modified speech is evaluated subjectively by 10 listeners, and mean opinion scores (MOS) are computed for pitch factors from 0.5 to 2. The modified Bark spectral distortion (MBSD) measure is also employed to evaluate objective performance. We found that the proposed approach receives higher MOS and achieves lower MBSD than the time-domain pitch synchronous overlap [4], modified-LP [3], and MVDR-based [1, 2] methods. Further, we modified the pitch contours of 20 affirmative sentences to sound like interrogative sentences, using the current as well as our earlier algorithms, and compared their performance.
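
The DCT/IDCT resampling step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no details, so the code assumes an orthonormal DCT-II, assumes that a pitch scale factor above 1 shortens the frame to round(N / factor) samples, and assumes a sqrt(M / N) gain to preserve amplitude. The function names `dct_ii`, `idct_ii`, and `resample_residual` are hypothetical.

```python
import math


def dct_ii(x):
    """Orthonormal DCT-II of a real sequence (direct O(N^2) form)."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def idct_ii(c):
    """Inverse of dct_ii (orthonormal DCT-III)."""
    n = len(c)
    out = []
    for i in range(n):
        s = c[0] * math.sqrt(1.0 / n)
        s += sum(
            c[k] * math.sqrt(2.0 / n) * math.cos(math.pi * (i + 0.5) * k / n)
            for k in range(1, n)
        )
        out.append(s)
    return out


def resample_residual(frame, pitch_factor):
    """Resample one pitch-synchronous residual frame in the DCT domain.

    Assumption: the new frame length is round(N / pitch_factor), so a
    factor of 2 halves the pitch period and a factor of 0.5 doubles it.
    """
    n = len(frame)
    m = max(1, round(n / pitch_factor))
    coeffs = dct_ii(frame)
    if m >= n:
        # Lengthen the frame: zero-pad the high-frequency coefficients.
        coeffs = coeffs + [0.0] * (m - n)
    else:
        # Shorten the frame: discard the high-frequency coefficients.
        coeffs = coeffs[:m]
    # Assumed gain so that sample amplitudes are preserved after the IDCT.
    gain = math.sqrt(m / n)
    return idct_ii([gain * c for c in coeffs])
```

For example, `resample_residual(frame, 2.0)` turns an 8-sample frame into a 4-sample frame. In a full system the resampled residual would then pass through the forward (synthesis) filter, which is not shown here.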
