Conference: International Conference on Instrumentation Measurement, Computer, Communication and Control

An Effective Real-Time Audio Segmentation Method Based on Time-Frequency Energy Analysis

Abstract

Audio segmentation is a vital preprocessing step in several audio processing applications. This paper proposes an effective multi-stage, real-time audio segmentation method based on time-frequency energy analysis. An energy distribution model for different frequency bands is built in the Mel frequency domain. In the rough segmentation stage, candidate start and end points are estimated from time-domain energy. Because the frequency-domain energy of audio and of silence shows different characteristics under the energy distribution model, the exact segmentation stage then detects the endpoints from frequency-domain energy. The strategy for initializing and dynamically adjusting the thresholds is also described. Experimental results show that the method reduces the false alarm rate and missed detection rate by 3.6% and 7.0%, respectively, compared with GLR-BIC, and by 7.7% and 11.5%, respectively, compared with the double-threshold method. Statistics show that audio recognition accuracy is higher for sentences lasting 1 s to 6 s and 6 s to 10 s, and 98% of the sentences segmented by this method fall within these durations, more than with the other two methods.
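
The abstract gives no formulas, so the following is only a minimal sketch of what a time-domain rough segmentation stage with a dynamically adjusted threshold might look like. It is not the authors' implementation. The function names (`frame_energies`, `rough_segments`), the parameters (`init_frames`, `ratio`, `alpha`), and the exponential-smoothing update rule are all assumptions made for illustration, and the Mel-band frequency-domain refinement stage is not shown.

```python
import math

def frame_energies(samples, frame_len):
    """Short-time energy of consecutive, non-overlapping frames."""
    return [
        sum(x * x for x in samples[i:i + frame_len])
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]

def rough_segments(energies, init_frames=5, ratio=4.0, alpha=0.9):
    """Return (start_frame, end_frame) pairs, with end_frame exclusive.

    Assumed rule (not taken from the paper): the noise floor is
    initialised from the first `init_frames` frames and updated by
    exponential smoothing whenever a frame is judged silent. A frame
    counts as active when its energy exceeds `ratio` times the floor.
    """
    noise = sum(energies[:init_frames]) / init_frames
    segments, start = [], None
    for i, e in enumerate(energies):
        active = e > ratio * noise
        if active and start is None:
            start = i                      # candidate starting point
        elif not active:
            if start is not None:
                segments.append((start, i))  # candidate finishing point
                start = None
            # dynamic adjustment: track the background level during silence
            noise = alpha * noise + (1 - alpha) * e
    if start is not None:
        segments.append((start, len(energies)))
    return segments

# Synthetic signal: 10 quiet frames, 10 loud frames, 10 quiet frames.
FRAME_LEN = 100
samples = [
    (0.5 if 10 <= n // FRAME_LEN < 20 else 0.01)
    * math.sin(2 * math.pi * 5 * n / FRAME_LEN)
    for n in range(30 * FRAME_LEN)
]
segments = rough_segments(frame_energies(samples, FRAME_LEN))
print(segments)  # [(10, 20)]
```

In a two-stage design like the one the abstract describes, the frame indices returned here would only be candidates, and a second pass over per-band spectral energy would refine them.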
