Conference: International Conference on Instrumentation Measurement, Computer, Communication and Control

An Effective Real-Time Audio Segmentation Method Based on Time-Frequency Energy Analysis

Abstract

Audio segmentation is a vital preprocessing step in several audio processing applications. This paper proposes an effective multi-stage, real-time audio segmentation method based on time-frequency energy analysis. An energy distribution model for different frequency bands is built in the Mel frequency domain. In the rough segmentation stage, the starting and finishing points are estimated from time-domain energy. Because the frequency-domain energy of audio and that of silence show different characteristics under the energy distribution model, the exact segmentation stage then detects the endpoints from frequency-domain energy. The strategy for initializing and dynamically adjusting the thresholds is also described. Experimental results show that the method reduces the false alarm rate and the missed detection rate by 3.6% and 7.0%, respectively, compared with GLR-BIC, and by 7.7% and 11.5%, respectively, compared with the double threshold method. Statistics show that audio recognition accuracy is higher for sentences with durations of 1 s to 6 s and 6 s to 10 s, and 98% of the sentences segmented by this method fall within these durations, more than with the other two methods.
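The abstract gives no formulas, so the following is only a minimal sketch of the general two-stage idea, not the authors' implementation. It assumes a coarse decision from short-time time-domain energy, a confirming decision from Mel-spaced band energies, thresholds initialized from the first few frames (assumed to be silence), and a simple exponential update of the noise floor as the dynamic adjustment. All function names, the rectangular band shapes, and the parameters `n_init`, `k_time`, `k_freq`, and `alpha` are hypothetical choices made for illustration.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames (assumes len(x) >= frame_len)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_band_energies(frames, sr, n_bands=8):
    """Per-frame energy in Mel-spaced bands (rectangular bands: an assumption)."""
    window = np.hanning(frames.shape[1])
    spec = np.abs(np.fft.rfft(frames * window, axis=1)) ** 2
    freqs = np.fft.rfftfreq(frames.shape[1], d=1.0 / sr)
    edges = mel_to_hz(np.linspace(0.0, hz_to_mel(sr / 2.0), n_bands + 1))
    bands = [spec[:, (freqs >= lo) & (freqs < hi)].sum(axis=1)
             for lo, hi in zip(edges[:-1], edges[1:])]
    return np.stack(bands, axis=1)

def segment(x, sr, frame_ms=25, hop_ms=10, n_init=10,
            k_time=4.0, k_freq=4.0, alpha=0.95):
    """Return a list of (start_s, end_s) segments. Hypothetical parameters throughout."""
    eps = 1e-12
    frame_len = int(sr * frame_ms / 1000)
    hop = int(sr * hop_ms / 1000)
    frames = frame_signal(np.asarray(x, dtype=float), frame_len, hop)
    e_time = (frames ** 2).mean(axis=1)       # time-domain energy per frame
    e_band = mel_band_energies(frames, sr)    # frequency-domain energy per band

    # Threshold initialization: the first n_init frames are assumed to be silence.
    noise_t = e_time[:n_init].mean() + eps
    noise_f = e_band[:n_init].mean(axis=0) + eps

    active = np.zeros(len(frames), dtype=bool)
    for i in range(len(frames)):
        # Rough stage: time-domain energy against the current noise floor.
        coarse = e_time[i] > k_time * noise_t
        # Exact stage: some band must also stand out from its own noise floor.
        active[i] = coarse and np.max(e_band[i] / noise_f) > k_freq
        if not coarse:
            # Dynamic adjustment: track the noise floor on frames judged silent.
            noise_t = max(alpha * noise_t + (1 - alpha) * e_time[i], eps)
            noise_f = np.maximum(alpha * noise_f + (1 - alpha) * e_band[i], eps)

    # Convert runs of active frames into (start, end) times in seconds.
    padded = np.concatenate(([False], active, [False]))
    changes = np.flatnonzero(padded[1:] != padded[:-1])
    starts, ends = changes[0::2], changes[1::2]
    return [(s * hop / sr, ((e - 1) * hop + frame_len) / sr)
            for s, e in zip(starts, ends)]
```

Because each frame is decided using only past frames, a loop of this shape could in principle run on streaming input. The multipliers and smoothing factor here are placeholders and would need tuning on real data.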