Conference: International Conference on Speech and Computer

Vocal Emotion Conversion Using WSOLA and Linear Prediction

Abstract

This paper addresses speech emotion conversion using Waveform Similarity Overlap-Add (WSOLA) followed by linear prediction (LP) analysis for spectral transformation. Duration is modified according to the ratio between the segment durations of the neutral and target speech. After WSOLA modification, the duration-modified source speech is time-aligned with the target and subjected to LP analysis to yield the LP coefficients (LPCs). The target emotion is re-synthesised using the prosody-manipulated residual and the LPCs from the source. The waveform similarity property of WSOLA is exploited to produce output with minimal distortion. The proposed algorithm is evaluated subjectively and objectively alongside the popular TD-PSOLA algorithm. The correlation between the synthesised and real target speech shows an average improvement of 60% across all emotions with the proposed technique.
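The LP analysis and re-synthesis steps named in the abstract can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation: the function names, the LP order of 8, the synthetic test frame, and the orientation of the duration ratio (target divided by neutral) are all assumptions made for illustration. WSOLA itself, time alignment, and prosody manipulation of the residual are not implemented here. The sketch only shows that a frame can be split into LP coefficients and a residual, and rebuilt by passing the residual through the all-pole filter.

```python
import math
import random

def duration_ratio(neutral_dur, target_dur):
    """Time-scale factor for one segment (assumed orientation: target / neutral)."""
    if neutral_dur <= 0:
        raise ValueError("neutral duration must be positive")
    return target_dur / neutral_dur

def autocorr(x, max_lag):
    """Autocorrelation r[0..max_lag] of a frame, treated as zero outside the frame."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Solve the LP normal equations; returns a = [1, a1, ..., ap] and the error energy."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a, err

def lp_residual(x, a):
    """Inverse filter: e[n] = x[n] + sum_j a[j] * x[n - j], with zero initial state."""
    p = len(a) - 1
    return [sum(a[j] * x[n - j] for j in range(min(p, n) + 1)) for n in range(len(x))]

def lp_synthesis(e, a):
    """All-pole filter: y[n] = e[n] - sum_j a[j] * y[n - j], with zero initial state."""
    p = len(a) - 1
    y = []
    for n in range(len(e)):
        y.append(e[n] - sum(a[j] * y[n - j] for j in range(1, min(p, n) + 1)))
    return y

# Hypothetical test frame: two sinusoids plus a little noise (not real speech).
random.seed(0)
frame = [math.sin(0.3 * n) + 0.5 * math.sin(0.9 * n) + 0.01 * random.gauss(0, 1)
         for n in range(240)]

a, err = levinson_durbin(autocorr(frame, 8), 8)   # LP order 8 is an assumption
residual = lp_residual(frame, a)                  # excitation-like signal
rebuilt = lp_synthesis(residual, a)               # should closely match `frame`
```

In a conversion setting, the residual would be modified (for example in duration or pitch) before the synthesis step, so the output would differ from the input frame; here it is left untouched to show that analysis and synthesis are inverse operations.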
