Frontiers in Psychology
Affective Voice Interaction and Artificial Intelligence: A Research Study on the Acoustic Features of Gender and the Emotional States of the PAD Model


Abstract

New types of artificial intelligence products are gradually shifting to voice interaction, as the demand for intelligent products expands from communication to recognizing users' emotions and providing instantaneous feedback. At present, affective acoustic models are constructed through deep learning and abstracted into a mathematical model, allowing computers to learn from data and equipping them with predictive abilities. Although this method can yield accurate predictions, it lacks explanatory power; there is an urgent need for an empirical study of the connection between acoustic features and psychology as the theoretical basis for adjusting model parameters. Accordingly, this study explores how seven major acoustic features and their physical characteristics differ during voice interaction with respect to gender and the emotional states of the pleasure-arousal-dominance (PAD) model. In this study, 31 females and 31 males aged between 21 and 60 were recruited by stratified random sampling to record audio expressing different emotions. Parameter values of the acoustic features were then extracted with the Praat voice software and analyzed using a two-way mixed-design ANOVA in SPSS. Results show that the seven major acoustic features vary with gender and with the emotional states of the PAD model, and that their difference values and rankings also vary. These conclusions lay a theoretical foundation for affective voice interaction in AI and address deep learning's current dilemma in emotion recognition and in parameter optimization of emotional synthesis models, which stems from its lack of explanatory power.
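The analysis step described above partitions variance across two crossed factors (gender × PAD emotional state). As a rough illustration of that partition, the following is a minimal pure-Python sketch of a two-way between-subjects ANOVA on invented mean-F0 values. Note that the study itself used a mixed-design ANOVA in SPSS (emotion as a within-subject factor); all numbers, level names, and cell sizes below are hypothetical.

```python
# Two-way between-subjects ANOVA on synthetic mean-F0 (pitch) values.
# Factor A = gender (2 levels), factor B = emotion (3 hypothetical levels),
# balanced design with n observations per cell. All data are invented.

from itertools import product
from statistics import mean

data = {
    ("female", "pleasure"):  [225.0, 231.0],
    ("female", "arousal"):   [244.0, 251.0],
    ("female", "dominance"): [210.0, 216.0],
    ("male",   "pleasure"):  [128.0, 133.0],
    ("male",   "arousal"):   [142.0, 149.0],
    ("male",   "dominance"): [118.0, 122.0],
}

levels_a = ["female", "male"]
levels_b = ["pleasure", "arousal", "dominance"]
n = 2                          # observations per cell (balanced design)
a, b = len(levels_a), len(levels_b)

all_vals = [v for cell in data.values() for v in cell]
grand = mean(all_vals)

# Marginal and cell means.
mean_a = {ga: mean([v for (g, _), cell in data.items() if g == ga for v in cell])
          for ga in levels_a}
mean_b = {eb: mean([v for (_, e), cell in data.items() if e == eb for v in cell])
          for eb in levels_b}
mean_cell = {k: mean(v) for k, v in data.items()}

# Sums of squares for main effects, interaction, and within-cell error.
ss_a = b * n * sum((mean_a[ga] - grand) ** 2 for ga in levels_a)
ss_b = a * n * sum((mean_b[eb] - grand) ** 2 for eb in levels_b)
ss_ab = n * sum((mean_cell[(ga, eb)] - mean_a[ga] - mean_b[eb] + grand) ** 2
                for ga, eb in product(levels_a, levels_b))
ss_within = sum((v - mean_cell[k]) ** 2 for k, cell in data.items() for v in cell)

df_a, df_b = a - 1, b - 1
df_ab = df_a * df_b
df_within = a * b * (n - 1)

ms_within = ss_within / df_within
f_a = (ss_a / df_a) / ms_within
f_b = (ss_b / df_b) / ms_within
f_ab = (ss_ab / df_ab) / ms_within

print(f"F(gender, df={df_a},{df_within})      = {f_a:.2f}")
print(f"F(emotion, df={df_b},{df_within})     = {f_b:.2f}")
print(f"F(interaction, df={df_ab},{df_within}) = {f_ab:.2f}")
```

In a balanced design the total sum of squares decomposes exactly into the four components above; the F ratios compare each effect's mean square against within-cell error, which is the logic the mixed-design ANOVA in the study extends to a within-subject emotion factor.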
