
On the Acoustics of Emotion in Audio: What Speech, Music, and Sound have in Common


Abstract

Without doubt, there is emotional information in almost any kind of sound received by humans every day: be it the affective state of a person transmitted by means of speech; the emotion intended by a composer while writing a musical piece, or conveyed by a musician while performing it; or the affective state connected to an acoustic event occurring in the environment, in the soundtrack of a movie, or in a radio play. In the field of affective computing, there is currently some loosely connected research concerning each of these phenomena, but a holistic computational model of affect in sound is still lacking. In turn, for tomorrow’s pervasive technical systems, including affective companions and robots, it is expected to be highly beneficial to understand the affective dimensions of “the sound that something makes,” in order to evaluate the system’s auditory environment and its own audio output. This article aims at a first step toward a holistic computational model: starting from standard acoustic feature extraction schemes in the domains of speech, music, and sound analysis, we interpret the worth of individual features across these three domains, considering four audio databases with observer annotations in the arousal and valence dimensions. In the results, we find that by selection of appropriate descriptors, cross-domain arousal and valence regression is feasible, achieving significant correlations with the observer annotations of up to 0.78 for arousal (training on sound and testing on enacted speech) and 0.60 for valence (training on enacted speech and testing on music). The high degree of cross-domain consistency in encoding the two main dimensions of affect may be attributable to the co-evolution of speech and music from multimodal affect bursts, including the integration of nature sounds for expressive effects.
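
To make the cross-domain evaluation concrete, the sketch below illustrates the general recipe described in the abstract: train a regressor on per-clip acoustic features and arousal annotations from one domain (e.g., general sound), predict on another domain (e.g., enacted speech), and report the Pearson correlation against the observer annotations. This is a minimal illustration, not the authors' implementation; the support vector regressor, the feature dimensionality, and the placeholder data are assumptions introduced here for clarity.

    # Hypothetical sketch of a cross-domain arousal regression setup.
    # Feature matrices are assumed to be pre-extracted acoustic descriptors
    # (e.g., energy, spectral, and voicing statistics); all names are illustrative.
    import numpy as np
    from scipy.stats import pearsonr
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVR

    def cross_domain_regression(X_train, y_train, X_test, y_test):
        """Train on one audio domain (e.g., sound) and evaluate on another
        (e.g., enacted speech), returning the Pearson correlation between
        predictions and observer annotations."""
        model = make_pipeline(StandardScaler(), SVR(kernel="linear", C=1.0))
        model.fit(X_train, y_train)
        predictions = model.predict(X_test)
        r, p_value = pearsonr(predictions, y_test)
        return r, p_value

    # Illustrative usage with random placeholder data (not the paper's corpora):
    rng = np.random.default_rng(0)
    X_sound, y_sound = rng.normal(size=(200, 50)), rng.uniform(-1, 1, 200)
    X_speech, y_speech = rng.normal(size=(150, 50)), rng.uniform(-1, 1, 150)
    r, p = cross_domain_regression(X_sound, y_sound, X_speech, y_speech)
    print(f"Cross-domain arousal correlation: r = {r:.2f} (p = {p:.3f})")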
