Vocal Emotion Conversion Using WSOLA and Linear Prediction

Abstract

The paper addresses speech emotion conversion using Waveform Similarity Overlap-Add (WSOLA) followed by linear prediction (LP) analysis for spectral transformation. Duration is modified according to the ratio between segment durations of the neutral and target speech. After WSOLA modification, the duration-modified source speech is time-aligned with the target and subjected to LP analysis to yield the LP coefficients (LPCs). The target emotion is re-synthesised using the prosody-manipulated residual and the LPCs from the source. The waveform similarity property of WSOLA is exploited to give output with minimal distortion. The proposed algorithm is evaluated subjectively and objectively alongside the popular TD-PSOLA algorithm. With the proposed technique, the correlation between synthesised and real target speech shows an average improvement of 60% across all emotions.
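
The building blocks named in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function names, the frame length, the search tolerance, the LP order and the direction of the duration ratio (target over source) are all assumptions, and the dynamic time warping alignment step is left out.

```python
import numpy as np
from scipy.linalg import solve_toeplitz
from scipy.signal import lfilter


def duration_ratio(source_dur, target_dur):
    """Time-scale factor; assumed here to be target duration / source duration."""
    return target_dur / source_dur


def wsola(x, alpha, frame=512, tol=128):
    """Stretch x by alpha (>1 lengthens) with a basic WSOLA loop (assumed parameters)."""
    x = np.asarray(x, dtype=float)
    hop = frame // 2
    # Squared-sine window: overlapping copies at a 50% hop sum to one.
    win = np.sin(np.pi * (np.arange(frame) + 0.5) / frame) ** 2
    n_frames = int(np.ceil(len(x) * alpha / hop))
    xp = np.pad(x, (tol, tol + 2 * frame))  # room for the search region
    y = np.zeros(n_frames * hop + frame)
    pos = tol  # the first frame is copied from the start of x
    for m in range(n_frames):
        if m > 0:
            # Natural continuation of the previously copied segment.
            target = xp[pos + hop: pos + hop + frame]
            nominal = tol + int(round(m * hop / alpha))
            cand = xp[nominal - tol: nominal + tol + frame]
            # Choose the shift (within +/- tol) most similar to that continuation.
            shift = int(np.argmax(np.correlate(cand, target, mode="valid")))
            pos = nominal - tol + shift
        y[m * hop: m * hop + frame] += win * xp[pos: pos + frame]
    return y[: int(round(len(x) * alpha))]


def lpc(segment, order=10):
    """LP polynomial A(z) = 1 - sum_k a_k z^-k via the autocorrelation method."""
    w = np.asarray(segment, dtype=float) * np.hamming(len(segment))
    r = np.correlate(w, w, mode="full")[len(w) - 1: len(w) + order]
    r[0] += 1e-9  # tiny bias so that silent segments stay solvable
    a = solve_toeplitz(r[:order], r[1: order + 1])
    return np.concatenate(([1.0], -a))


def lp_residual(x, a):
    """Inverse-filter x with A(z) to obtain the excitation (residual)."""
    return lfilter(a, [1.0], x)


def lp_resynthesise(residual, a):
    """Drive the all-pole filter 1/A(z) with a residual."""
    return lfilter([1.0], a, residual)


def pearson(u, v):
    """Correlation coefficient between two equal-length signals."""
    return float(np.corrcoef(u, v)[0, 1])
```

A typical call chain under these assumptions would be `alpha = duration_ratio(src_dur, tgt_dur)`, then `stretched = wsola(source, alpha)`, then per-segment `lpc`, `lp_residual` and `lp_resynthesise`, with `pearson` used to compare the synthesised signal against a real target recording of equal length.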

Bibliographic details

Source
International Conference on Speech and Computer | 2017 | pp. 777-787 | 11 pages
Authors
Susmitha Vekkot; Shikha Tripathi
Original format: PDF
Keywords
Emotion; WSOLA; Linear prediction; Dynamic time warping; Comparative mean opinion score; Correlation coefficient

Similar literature

1. Prosodic transformation in vocal emotion conversion for multi-lingual scenarios: a pilot study [J]. Susmitha Vekkot, Deepa Gupta. International Journal of Speech Technology, 2019, no. 3.
2. Do nonlinear vocal phenomena signal negative valence or high emotion intensity? [J]. Andrey Anikin, Katarzyna Pisanski, David Reby. Royal Society Open Science, 2020, no. 12.
3. Feature Extraction Using Power-Law Adjusted Linear Prediction With Application to Speaker Recognition Under Severe Vocal Effort Mismatch [J]. Saeidi Rahim, Alku Paavo, Backstrom Tom. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2016, no. 1.
4. Vocal Emotion Conversion Using WSOLA and Linear Prediction [C]. Susmitha Vekkot, Shikha Tripathi. International Conference on Speech and Computer, 2017.
5. Unified mathematical model for linear and nonlinear viscoelastic predictions of linear monodisperse and polydisperse and branched polymers [D]. Khaliullin, Renat N. 2010.
6. Do nonlinear vocal phenomena signal negative valence or high emotion intensity? [O]. Andrey Anikin, Katarzyna Pisanski, David Reby. 2020.
7. Feasibility of vocal emotion conversion on modulation spectrogram for simulated cochlear implants [O]. Zhi Zhu, Ryota Miyauchi, Yukiko Araki, 2017.
8. Telecommunications: Analog to Digital Conversion of Radio Voice by 4,800 Bit/Second Code Excited Linear Prediction (CELP). Federal Standard 1016 [R]. 1991.
