
Unsupervised Speech/Non-Speech Detection for Automatic Speech Recognition in Meeting Rooms


Abstract

The goal of this work is to provide robust and accurate speech detection for automatic speech recognition (ASR) in meeting room settings. The solution is based on computing the long-term modulation spectrum and examining a specific frequency range for dominant speech components, in order to classify a given audio signal into speech and non-speech. For comparison, manually segmented speech, segmentation based on short-term energy alone, segmentation based on short-term energy combined with zero-crossings, and a recently proposed multilayer perceptron (MLP) classifier system are also tested. The segmentation methods are evaluated by speech recognition performance on a standard database, under conditions where the signal-to-noise ratio (SNR) varies considerably: close-talking headset, lapel microphone, distant microphone array output, and distant microphone. The results reveal that the proposed method is more reliable and less sensitive to the mode of signal acquisition and to unforeseen conditions.
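The abstract gives no parameters, so the following is only a minimal sketch of the general idea of a modulation-spectrum speech detector, not the authors' implementation. It assumes a 10 ms RMS envelope, 1 s analysis windows, a 2 to 8 Hz modulation band (chosen because speech energy is typically modulated at syllabic rates of a few hertz), and a decision threshold of 0.5. All function names and all of these values are hypothetical.

```python
import numpy as np

def rms_envelope(signal, sample_rate, hop_s=0.01):
    """Short-term RMS envelope: one value per non-overlapping hop."""
    hop = int(round(hop_s * sample_rate))
    n_frames = len(signal) // hop
    frames = np.asarray(signal[: n_frames * hop], dtype=float).reshape(n_frames, hop)
    return np.sqrt(np.mean(frames ** 2, axis=1))

def modulation_band_ratio(env, frame_rate, band=(2.0, 8.0)):
    """Fraction of non-DC modulation energy inside `band` (Hz).

    The band limits are an assumption, not taken from the paper.
    """
    env = np.asarray(env, dtype=float)
    env = env - env.mean()  # remove the DC component of the envelope
    power = np.abs(np.fft.rfft(env * np.hanning(len(env)))) ** 2
    freqs = np.fft.rfftfreq(len(env), d=1.0 / frame_rate)
    total = power[freqs > 0].sum()
    if total <= 0.0:  # perfectly flat envelope, e.g. digital silence
        return 0.0
    in_band = power[(freqs >= band[0]) & (freqs <= band[1])].sum()
    return float(in_band / total)

def detect_speech(signal, sample_rate, window_s=1.0, hop_s=0.01, threshold=0.5):
    """Label each long-term window as speech (True) or non-speech (False)."""
    env = rms_envelope(signal, sample_rate, hop_s)
    frame_rate = sample_rate / int(round(hop_s * sample_rate))
    win = int(round(window_s * frame_rate))
    n_windows = len(env) // win
    return np.array(
        [
            modulation_band_ratio(env[i * win : (i + 1) * win], frame_rate) > threshold
            for i in range(n_windows)
        ],
        dtype=bool,
    )
```

Because the decision uses a ratio of modulation energies instead of an absolute level, it does not depend on the overall signal gain. That is one plausible reason such a detector could be less sensitive to microphone type than an energy threshold. A practical system would still need smoothing of the window-level decisions and tuning on real recordings.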
