
Identification of Target Speech Utterances from Real Public Conversation


Abstract

We are developing a conversation support system that estimates how smoothly a human-to-human conversation is progressing. When the system senses that the conversation has made little progress, it offers a topic to encourage smoother discussion and a better atmosphere. The conversation atmosphere is estimated from the fundamental frequency (F0) and sound power (SP). In practical use, two problems arise: (1) Ambient noise, especially the nonstationary speech of a person behind the target speaker, lowers the conversation-atmosphere estimation rate. This speech noise is difficult to cancel, even with current noise-cancelling methods. (2) Laughter utterances, whose acoustic characteristics differ considerably from those of ordinary speech utterances, occur often in daily conversation and degrade conversation-atmosphere estimation performance. In this paper, we propose a method that distinguishes target speech utterances from ambient speech noise and laughter utterances using the standard deviation of SP and Mel-Frequency Cepstral Coefficients (MFCC).
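The abstract names the standard deviation of SP as one feature but does not define how it is computed. The following is a minimal sketch under stated assumptions, not the authors' implementation: SP is taken to be the per-frame mean squared amplitude in dB, and the frame length, hop size, and the function names `frame_power_db` and `sp_std` are hypothetical choices made for illustration. MFCC extraction and the decision rule itself are not covered.

```python
import numpy as np

def frame_power_db(signal, frame_len=400, hop=160, eps=1e-12):
    """Per-frame sound power in dB (assumed definition: mean squared amplitude)."""
    signal = np.asarray(signal, dtype=float)
    if len(signal) < frame_len:
        raise ValueError("signal is shorter than one frame")
    n_frames = 1 + (len(signal) - frame_len) // hop
    powers = np.empty(n_frames)
    for i in range(n_frames):
        frame = signal[i * hop : i * hop + frame_len]
        powers[i] = np.mean(frame ** 2)
    # eps avoids log(0) on silent frames
    return 10.0 * np.log10(powers + eps)

def sp_std(signal, frame_len=400, hop=160):
    """Standard deviation of the per-frame sound power (in dB) over an utterance."""
    return float(np.std(frame_power_db(signal, frame_len, hop)))

# Synthetic illustration (not data from the paper): a steady tone versus
# the same tone with an on/off amplitude envelope.
sr = 16000
t = np.arange(sr) / sr
steady = 0.1 * np.sin(2 * np.pi * 200 * t)
envelope = 0.05 + 0.95 * (np.sin(2 * np.pi * 2 * t) > 0)
bursty = steady * envelope

print("steady SP std (dB):", sp_std(steady))
print("bursty SP std (dB):", sp_std(bursty))
```

In this sketch the steady tone yields a near-zero SP standard deviation, while the amplitude-modulated signal yields a much larger one. How the paper maps such values, together with MFCC, onto the target-speech, ambient-speech, and laughter classes is not stated in the abstract.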
