Speech Analysis in the Big Data Era

Abstract

In spoken language analysis tasks, one often faces comparatively small corpora of one to a few hours of speech material, mostly annotated for a single phenomenon at a time, such as a particular speaker state. In stark contrast, engines for the recognition of speakers' emotions, sentiment, personality, or pathologies are often expected to run independently of the speaker, the spoken content, and the acoustic conditions. This lack of large and richly annotated material likely explains much of the headroom left for improving the accuracy of today's engines. Yet in the big data era, with the increasing availability of crowd-sourcing services and recent advances in weakly supervised learning, new opportunities arise to ease this shortage. In this light, this contribution first shows the de facto standard in terms of data availability across a broad range of speaker analysis tasks. It then introduces highly efficient 'cooperative' learning strategies, based on the combination of active and semi-supervised learning alongside transfer learning, to best exploit available data in combination with data synthesis. Further, approaches to estimating meaningful confidence measures in this domain are suggested, as they form (part of) the basis of the weakly supervised learning algorithms. In addition, first successful approaches towards holistic speech analysis are presented, using deep recurrent rich multi-target learning with partially missing label information. Finally, steps towards the distribution of processing needed for big data handling are demonstrated.
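
The abstract does not spell out how the 'cooperative' strategies combine active and semi-supervised learning. One common reading, assumed here and not confirmed by the source, is that a confidence measure decides who labels each unlabeled instance: high-confidence predictions are kept as machine labels (the semi-supervised part), low-confidence ones are sent to a human annotator (the active part). The following is a minimal sketch under that assumption; the function name `route_by_confidence`, the thresholds, and the example predictions are all hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def route_by_confidence(
    predictions: Sequence[Tuple[str, float]],
    high: float = 0.9,
    low: float = 0.6,
) -> Dict[str, List[int]]:
    """Split unlabeled instances by classifier confidence.

    predictions: one (predicted_label, confidence) pair per instance,
                 with confidence in [0, 1].
    Returns instance indices grouped into three buckets:
      "machine"  - confidence >= high, keep the predicted label
      "human"    - confidence <  low, ask a human annotator
      "deferred" - in between, leave unlabeled for a later round
    The thresholds are illustrative assumptions, not values from the paper.
    """
    if not 0.0 <= low <= high <= 1.0:
        raise ValueError("expected 0 <= low <= high <= 1")

    routed: Dict[str, List[int]] = {"machine": [], "human": [], "deferred": []}
    for index, (_label, confidence) in enumerate(predictions):
        if confidence >= high:
            routed["machine"].append(index)
        elif confidence < low:
            routed["human"].append(index)
        else:
            routed["deferred"].append(index)
    return routed


# Hypothetical emotion predictions for four unlabeled utterances.
predictions = [("angry", 0.97), ("neutral", 0.41), ("happy", 0.75), ("sad", 0.92)]
routed = route_by_confidence(predictions)
print(routed)
```

In this sketch the quality of the routing depends entirely on how well the confidence values reflect actual correctness, which is consistent with the abstract's point that confidence measures form part of the basis of the weakly supervised learning algorithms.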

Bibliographic record

Source: International Conference on Text, Speech and Dialogue, 2015, pp. 3-11 (9 pages)
Author: Bjoern W. Schuller
Original format: PDF
Keywords: Speech analysis; Paralinguistics; Big data; Self-learning
