
Searching Audio-Visual Clips for Dual-mode Chinese Emotional Speech Database

Abstract

A widely accepted Chinese emotional speech database with abundant spontaneous speech is essential to Chinese emotional speech recognition and affective computing. This paper presents a new method of constructing such a Chinese audio-visual spontaneous emotional speech database. The source materials come from a variety of Chinese-language videos. Voice Activity Detection is used to capture the sets of start and end times of the syntactic boundaries in a dialogue. These time sets help the subsequent extraction step to ensure that each clip contains a complete phrase or sentence. The Microsoft Emotion API is adopted to compute frame-level confidences across a set of eight emotional states from the videos. A joint compression-discrimination algorithm is presented to detect which clips are accepted as candidates and which emotional state each one most likely expresses. Finally, a manual listening test and correction are carried out. The data analysis shows that the proposed method is feasible and effective.
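The abstract does not spell out the joint compression-discrimination algorithm, so the following is only a minimal sketch of the general idea of combining speech segments with frame-level emotion confidences. It is not the authors' method. The function name `label_segments`, the input layout, the averaging rule, and the `min_confidence` threshold are all assumptions made for illustration; the eight emotion names are those reported by the Microsoft Emotion API.

```python
from statistics import mean

# The eight emotion states reported by the Microsoft Emotion API.
EMOTIONS = ("anger", "contempt", "disgust", "fear",
            "happiness", "neutral", "sadness", "surprise")


def label_segments(segments, frames, min_confidence=0.6):
    """Assign a dominant emotion to each speech segment (illustrative only).

    segments: list of (start_s, end_s) pairs, e.g. from voice activity detection.
    frames: list of (time_s, scores) pairs; scores maps emotion -> confidence.
    min_confidence: hypothetical acceptance threshold on the mean confidence.
    Returns (start_s, end_s, emotion, mean_confidence) for each accepted clip.
    """
    accepted = []
    for start, end in segments:
        # Collect the frame-level scores that fall inside this segment.
        inside = [scores for t, scores in frames if start <= t <= end]
        if not inside:
            continue  # no scored video frames for this segment
        # Average each emotion's confidence over the segment's frames.
        averages = {e: mean(s.get(e, 0.0) for s in inside) for e in EMOTIONS}
        best = max(averages, key=averages.get)
        # Keep the clip only if its dominant emotion is confident enough.
        if averages[best] >= min_confidence:
            accepted.append((start, end, best, averages[best]))
    return accepted


# Made-up example data: two speech segments and three scored frames.
segments = [(0.0, 2.0), (3.0, 5.0)]
frames = [
    (0.5, {"happiness": 0.9, "neutral": 0.1}),
    (1.5, {"happiness": 0.7, "neutral": 0.3}),
    (3.5, {"neutral": 0.5, "sadness": 0.5}),
]
clips = label_segments(segments, frames)
```

In this toy input the first segment is kept as a happiness candidate, while the second is rejected because no emotion reaches the assumed threshold. Clips that pass an automatic step like this would still go through the manual listening test described in the abstract.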

Bibliographic record

Source: 2018, pp. 1-6 (6 pages)
Conference location: Beijing (CN)
Authors: Xudong Zhang; Guoqing Wu; Fuji Ren
Author affiliations:

Faculty of Engineering, Tokushima University, Tokushima, Japan, and School of Electronics and Information Engineering, Nantong University, Jiangsu, China;

School of Electronics and Information Engineering, Nantong University, Jiangsu, China;

Faculty of Engineering, Tokushima University, Tokushima, Japan;

Original format: PDF
Language: English
Keywords:
Videos; Databases; Speech recognition; Streaming media; Affective computing; Syntactics; Manuals;

Indexed: 2022-08-26 14:26:36

Similar literature

1. CHEAVD: a Chinese natural emotional audio-visual database [J]. Li Ya, Tao Jianhua, Chao Linlin. Journal of Ambient Intelligence and Humanized Computing, 2017, no. 6.

2. Recognizing emotional speech in Persian: A validated database of Persian emotional speech (Persian ESD) [J]. Niloofar Keshtiari, Michael Kuhlmann, Moharram Eslami. Behavior Research Methods, 2015, no. 1.

3. Erratum to: Recognizing emotional speech in Persian: A validated database of Persian emotional speech (Persian ESD) [J]. Niloofar Keshtiari, Michael Kuhlmann, Moharram Eslami. Behavior Research Methods, 2015, no. 1.

4. Searching Audio-Visual Clips for Dual-mode Chinese Emotional Speech Database [C]. Xudong Zhang, Guoqing Wu, Fuji Ren. Asian Conference on Affective Computing and Intelligent Interaction, 2018.

5. Robust speech processing based on microphone array, audio-visual, and frame selection for in-vehicle speech recognition and in-set speaker recognition [D]. Zhang, Xianxian. 2005.

6. Do gender differences in audio-visual benefit and visual influence in audio-visual speech perception emerge with age? [O]. Magnus Alm, Dawn Behne.

7. Erratum to: Recognizing emotional speech in Persian: A validated database of Persian emotional speech (Persian ESD) [O]. Niloofar Keshtiari, Michael Kuhlmann, Moharram Eslami. 2014.
