Journal: Circuits, Systems, and Signal Processing

STRAIGHT-Based Emotion Conversion Using Quadratic Multivariate Polynomial


Abstract

Speech is the natural mode of communication and the easiest way of expressing human emotions. Emotional speech is characterized by features such as the f0 contour, intensity, speaking rate, and voice quality; together, these features are called prosody. Generally, prosody is modified by pitch and time scaling. Unlike voice conversion, where spectral conversion is the main concern, emotional speech conversion is more sensitive to prosody. Several techniques, linear as well as nonlinear, have been used for transforming speech. Our hypothesis is that the quality of emotional speech conversion can be improved by estimating the nonlinear relationship between the neutral and emotional speech feature vectors. In this work, a quadratic multivariate polynomial (QMP) is explored for transforming neutral speech into emotional target speech. Both subjective and objective analyses were carried out to evaluate the transformed emotional speech, using comparison mean opinion scores (CMOS), mean opinion scores (MOS), identification rate, root-mean-square error, and Mahalanobis distance. For the Toronto emotional database, except for neutral/sad conversion, the CMOS analysis indicates that the transformed speech can partly be perceived as the target emotion. Moreover, the MOS and spectrogram indicate good quality of the transformed speech. For the German database, except for neutral/boredom conversion, the proposed technique achieves a better CMOS score than the gross and initial-middle-final methods but a lower score than the syllable method. However, the QMP technique is simple, is easy to implement, yields better quality of transformed speech, and estimates the transformation function using a limited number of training utterances.
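To make the idea of a quadratic multivariate polynomial mapping concrete, the following is a minimal sketch, not the authors' implementation. It assumes that neutral and emotional feature vectors are already time-aligned row by row, that the polynomial contains a constant term, all linear terms, and all second-order products, and that the coefficients are estimated by ordinary least squares. The function names (`quadratic_features`, `fit_qmp`, `apply_qmp`, `rmse`, `mahalanobis`) and the synthetic data are hypothetical; the abstract does not specify the feature set, the alignment, or the estimation procedure.

```python
import numpy as np


def quadratic_features(X):
    """Expand each row x into [1, x_i, x_i * x_j for i <= j]."""
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    i, j = np.triu_indices(d)          # index pairs with i <= j
    cross = X[:, i] * X[:, j]          # squares and pairwise products
    return np.hstack([np.ones((n, 1)), X, cross])


def fit_qmp(X_neutral, Y_emotional):
    """Least-squares estimate of the polynomial coefficients (assumed estimator)."""
    Phi = quadratic_features(X_neutral)
    W, *_ = np.linalg.lstsq(Phi, np.asarray(Y_emotional, dtype=float), rcond=None)
    return W


def apply_qmp(W, X_neutral):
    """Map neutral feature vectors to predicted emotional feature vectors."""
    return quadratic_features(X_neutral) @ W


def rmse(Y_true, Y_pred):
    """Root-mean-square error over all frames and dimensions."""
    diff = np.asarray(Y_true, dtype=float) - np.asarray(Y_pred, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))


def mahalanobis(x, samples):
    """Mahalanobis distance from vector x to the distribution of `samples`."""
    samples = np.asarray(samples, dtype=float)
    mu = samples.mean(axis=0)
    cov = np.cov(samples, rowvar=False)
    diff = np.asarray(x, dtype=float) - mu
    return float(np.sqrt(diff @ np.linalg.solve(cov, diff)))


# Synthetic stand-in data (hypothetical): 200 aligned frames, 3 features each.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))                     # "neutral" features
W_true = rng.normal(size=(10, 3))                 # 1 + 3 + 6 polynomial terms
Y = quadratic_features(X) @ W_true                # "emotional" targets

W = fit_qmp(X, Y)
Y_hat = apply_qmp(W, X)
fit_error = rmse(Y, Y_hat)
center_distance = mahalanobis(Y.mean(axis=0), Y)
print("RMSE on training frames:", fit_error)
print("Mahalanobis distance of the target mean:", center_distance)
```

Because the synthetic targets here are generated by an exact quadratic polynomial, the fit recovers them almost perfectly; on real speech features the residual error would be nonzero, and the two metrics would be computed between converted and target emotional features.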