International Conference on Text, Speech and Dialogue

Voice Activity Detector (VAD) Based on Long-Term Mel Frequency Band Features



Abstract

We propose a voice activity detector (VAD) that combines long-term (200 ms) Mel frequency band statistics, auditory masking, and a pre-trained classifier based on a two-level decision tree ensemble. This combination captures the syllable-level structure of speech and discriminates it from common noises. On the test dataset, the proposed algorithm accepts almost 100% of clear voice for English, Chinese, Russian, and Polish speech and rejects 100% of stationary noises, independently of loudness. The algorithm is intended as a trigger for automatic speech recognition (ASR). It reuses the short-term FFT analysis (STFFT) of the ASR frontend, adding 2 KB of memory and 15% complexity overhead.
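To make the idea of long-term Mel band statistics concrete, the following is a minimal sketch and not the authors' implementation. It assumes that per-frame power spectra are already available from the ASR frontend, that the frame hop is 10 ms (so 200 ms spans 20 frames), and that the statistics are the per-band mean and variance of log Mel energies. None of these details are given in the abstract. All function names are hypothetical, and the auditory masking and decision tree stages are omitted.

```python
import math

def hz_to_mel(f):
    # Common Mel-scale formula (an assumption; the abstract does not specify one).
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_bands, n_bins, sample_rate):
    """Triangular Mel filters over n_bins power-spectrum bins (0 Hz to Nyquist)."""
    nyquist = sample_rate / 2.0
    mel_max = hz_to_mel(nyquist)
    # n_bands + 2 edge frequencies, equally spaced on the Mel scale.
    edges = [mel_to_hz(mel_max * i / (n_bands + 1)) for i in range(n_bands + 2)]
    bin_hz = nyquist / (n_bins - 1)
    filters = []
    for b in range(n_bands):
        lo, mid, hi = edges[b], edges[b + 1], edges[b + 2]
        weights = []
        for k in range(n_bins):
            f = k * bin_hz
            if lo < f <= mid:
                weights.append((f - lo) / (mid - lo))
            elif mid < f < hi:
                weights.append((hi - f) / (hi - mid))
            else:
                weights.append(0.0)
        filters.append(weights)
    return filters

def log_mel_energies(power_frame, filters, floor=1e-10):
    """Log energy in each Mel band for one short-term power spectrum."""
    return [math.log(max(sum(w * p for w, p in zip(flt, power_frame)), floor))
            for flt in filters]

def long_term_stats(log_mel_frames, window_frames=20):
    """Per-band mean and variance over the most recent window_frames frames.

    With the assumed 10 ms hop, window_frames=20 covers 200 ms.
    """
    window = log_mel_frames[-window_frames:]
    n = len(window)
    n_bands = len(window[0])
    means = [sum(fr[b] for fr in window) / n for b in range(n_bands)]
    variances = [sum((fr[b] - means[b]) ** 2 for fr in window) / n
                 for b in range(n_bands)]
    return means, variances

# Usage: a perfectly stationary input yields near-zero long-term variance.
filters = mel_filterbank(n_bands=8, n_bins=129, sample_rate=16000)
stationary = [log_mel_energies([1.0] * 129, filters) for _ in range(20)]
means, variances = long_term_stats(stationary)
```

In this sketch, a stationary input produces near-zero variance in every band regardless of its level, while a signal whose energy fluctuates within the 200 ms window produces a large variance. Level-independent features of this kind are one plausible input to a classifier such as the one the abstract describes, but the features actually used in the paper may differ.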
