Frontiers in Psychology

On the Acoustics of Emotion in Audio: What Speech, Music, and Sound have in Common



Abstract

Without doubt, there is emotional information in almost any kind of sound received by humans every day: be it the affective state of a person transmitted by means of speech; the emotion intended by a composer while writing a musical piece, or conveyed by a musician while performing it; or the affective state connected to an acoustic event occurring in the environment, in the soundtrack of a movie, or in a radio play. In the field of affective computing, there is currently some loosely connected research concerning each of these phenomena, but a holistic computational model of affect in sound is still lacking. In turn, for tomorrow’s pervasive technical systems, including affective companions and robots, it is expected to be highly beneficial to understand the affective dimensions of “the sound that something makes,” in order to evaluate the system’s auditory environment and its own audio output. This article aims at a first step toward a holistic computational model: starting from standard acoustic feature extraction schemes in the domains of speech, music, and sound analysis, we interpret the worth of individual features across these three domains, considering four audio databases with observer annotations in the arousal and valence dimensions. In the results, we find that by selection of appropriate descriptors, cross-domain arousal and valence regression is feasible, achieving significant correlations with the observer annotations of up to 0.78 for arousal (training on sound and testing on enacted speech) and 0.60 for valence (training on enacted speech and testing on music). The high degree of cross-domain consistency in encoding the two main dimensions of affect may be attributable to the co-evolution of speech and music from multimodal affect bursts, including the integration of nature sounds for expressive effects.
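To make the cross-domain protocol described above concrete, here is a minimal sketch: standardize acoustic descriptors per domain, fit a regressor on one domain, predict on another, and score by correlation with observer annotations. Everything below is an illustrative assumption, not the study's actual setup: the data is synthetic, the feature matrices and labels are hypothetical stand-ins, and ridge regression merely stands in for whatever learner the authors used on their standard large-scale feature sets.

```python
# Illustrative sketch of cross-domain emotion regression; synthetic data only.
import numpy as np
from scipy.stats import pearsonr
from sklearn.linear_model import Ridge
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Hypothetical stand-ins for per-clip acoustic descriptors
# (rows = audio clips, columns = features such as energy, pitch,
# and spectral statistics) with observer-annotated arousal labels.
X_sound = rng.normal(size=(200, 40))      # "sound" domain
y_sound = X_sound[:, :5].mean(axis=1)     # synthetic arousal annotations
X_speech = rng.normal(size=(150, 40))     # "enacted speech" domain
y_speech = X_speech[:, :5].mean(axis=1)

# Standardize each domain separately and train on one domain ...
model = Ridge(alpha=1.0).fit(StandardScaler().fit_transform(X_sound), y_sound)

# ... then test on the other, scoring by Pearson correlation with the
# (here synthetic) observer annotations, as in the cross-domain setup.
pred = model.predict(StandardScaler().fit_transform(X_speech))
r, _ = pearsonr(pred, y_speech)
print(f"cross-domain correlation with annotations: r = {r:.2f}")
```

In the study itself, this train-on-one-domain, test-on-another evaluation against human observer annotations on four databases is what yields the reported correlations of up to 0.78 for arousal and 0.60 for valence.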
