Identification of Target Speech Utterances from Real Public Conversation

Abstract

We are developing a conversation support system that estimates how smoothly a human-to-human conversation is progressing. When the system senses that the conversation has made little progress, it attempts to provide a topic that leads to a smoother discussion and a good atmosphere. The conversation atmosphere is estimated using the fundamental frequency (F0) and sound power (SP). In practical use, the following problems occur: (1) Ambient noises, especially nonstationary speech signals from a person behind the target speaker, lower the conversation-atmosphere estimation rate. This speech noise is difficult to cancel, even with current noise-cancelling methods. (2) Laughter utterances, whose acoustic characteristics differ considerably from those of ordinary speech utterances, occur often in daily conversation and degrade the conversation-atmosphere estimation performance. In this paper, we propose a method that identifies target speech utterances among ambient speech noises and laughter utterances using the standard deviation of SP and Mel-Frequency Cepstral Coefficients (MFCC).
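One of the two features named in the abstract, the standard deviation of SP, can be illustrated with a minimal sketch. The abstract does not give the frame length, hop size, power scale, or decision rule, so everything below is an assumption: `frame_power_db` and `sp_std` are hypothetical helper names, the 400-sample frame and 160-sample hop are common defaults for 16 kHz audio, and power is expressed in dB. The MFCC feature is not covered here.

```python
import math
import statistics


def frame_power_db(samples, frame_len=400, hop=160, eps=1e-12):
    """Per-frame sound power in dB (frame_len and hop are assumed values)."""
    powers = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        mean_square = sum(x * x for x in frame) / frame_len
        # eps avoids log10(0) on silent frames
        powers.append(10.0 * math.log10(mean_square + eps))
    return powers


def sp_std(samples, frame_len=400, hop=160):
    """Population standard deviation of frame-level SP over one utterance."""
    powers = frame_power_db(samples, frame_len, hop)
    if len(powers) < 2:
        return 0.0  # too short to measure any variation
    return statistics.pstdev(powers)


if __name__ == "__main__":
    rate = 16000  # assumed sampling rate in Hz
    # Synthetic stand-ins, not real speech: a steady tone and a tone
    # whose level jumps halfway through.
    tone = [math.sin(2 * math.pi * 200 * n / rate) for n in range(rate)]
    bursty = [(0.1 if n < rate // 2 else 1.0) * x for n, x in enumerate(tone)]
    print("steady tone SP std:", sp_std(tone))
    print("bursty tone SP std:", sp_std(bursty))
```

The sketch only computes the feature. How the value is thresholded or combined with MFCC to separate target speech from ambient speech or laughter is left to the paper itself.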

Bibliographic record

Source
International Conference on Digital Human Modeling and Applications in Health, Safety, Ergonomics and Risk Management; International Conference on Human-Computer Interaction, 2020, pp. 52-63 (12 pages)
Authors
Naoto Kosaka; Yumi Wakita
Original format
PDF
Keywords
Conversation support system; Ambient speech cancelling; Laughter utterance identification

Similar literature

1. Short Utterance Based Speech Language Identification in Intelligent Vehicles With Time-Scale Modifications and Deep Bottleneck Features [J]. Ma Zhanyu, Yu Hong, Chen Wei. IEEE Transactions on Vehicular Technology, 2019, No. 1.
2. Identification of Noisy Utterance Speech Signal using GA-Based Optimized 2D-MFCC Method and a Bispectrum Analysis [J]. Benyamin Kusumoputro, Agus Buono, Li Na. Journal of Software Engineering and Applications, 2012, No. 12.
3. Publicly Available Online Tool Facilitates Real-Time Monitoring Of Vaccine Conversations And Sentiments [J]. Bahk Chi Y., Cumming Melissa, Paushter Louisa. Health Affairs, 2016, No. 2.
4. Evaluating target utterance identification method using practical free conversation [C]. Naoto Kosaka, Yumi Wakita. IEEE International Conference on Artificial Intelligence in Engineering and Technology, 2020.
5. Direct non-linear acoustic and elastic inversion: Towards fundamentally new comprehensive and realistic target identification [D]. Zhang, Haiyan. 2006.
6. Conversation Electrified: ERP Correlates of Speech Act Recognition in Underspecified Utterances [O]. Rosa S. Gisladottir, Dorothee J. Chwilla, Stephen C. Levinson.
7. Conversation electrified: ERP correlates of speech act recognition in underspecified utterances [O]. Gísladóttir, R.S., Chwilla, D.J., Levinson, S.C. 2015.
