Published in: Annual Conference of the International Speech Communication Association

Prof-Life-Log: Audio Environment Detection for Naturalistic Audio Streams

Abstract

In this study, we develop a new system for real-world audio environment matching. Environment detection within unknown audio streams requires a system that operates in an unsupervised manner, since it will encounter unknown environments without prior information. In addition, the overall solution should be computationally efficient for large audio collections. In the proposed approach, a Gaussian mixture model (GMM) is trained on large amounts of unlabeled audio data and used as a background acoustic model. Subsequently, an acoustic signature vector (ASV) is computed for each environment. The ASV is designed to capture the unique acoustic characteristics of an environment. Using ASVs, we demonstrate that it is possible to compute an effective similarity measure between two acoustic environments. We evaluate the proposed system on real-world audio data and compare it to a traditional GMM-UBM (universal background model) system. Experiments show that our system achieves an equal error rate (EER) that is 35% better than that of the baseline GMM-UBM system.
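
The abstract does not say how the ASV or the similarity measure is computed, so the following is only a minimal sketch under stated assumptions, not the authors' method. It assumes the ASV is the average posterior probability of each background-GMM component over the frames of a segment, and that similarity is the cosine between two ASVs. The model parameters (`WEIGHTS`, `MEANS`, `VARS`), the 2-D features, and the helper names are all hypothetical stand-ins for a GMM trained on unlabeled audio.

```python
import math
import random

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at frame x."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def posteriors(x, weights, means, variances):
    """Posterior probability of each background-GMM component for one frame."""
    logs = [
        math.log(w) + log_gauss_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    top = max(logs)  # subtract the maximum for numerical stability
    exps = [math.exp(l - top) for l in logs]
    total = sum(exps)
    return [e / total for e in exps]

def signature(frames, weights, means, variances):
    """Assumed ASV: mean component posterior over all frames of a segment."""
    acc = [0.0] * len(weights)
    for x in frames:
        for k, p in enumerate(posteriors(x, weights, means, variances)):
            acc[k] += p
    return [a / len(frames) for a in acc]

def cosine(u, v):
    """Cosine similarity between two signature vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# Hypothetical background model: 3 components over 2-D features.
# In practice these would come from training on unlabeled audio features.
WEIGHTS = [1 / 3, 1 / 3, 1 / 3]
MEANS = [[0.0, 0.0], [4.0, 4.0], [-4.0, 4.0]]
VARS = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]

def sample(center, n, rng):
    """Synthetic feature frames scattered around a center (toy data)."""
    return [[rng.gauss(center[0], 1.0), rng.gauss(center[1], 1.0)] for _ in range(n)]

rng = random.Random(0)
office_a = signature(sample([0.0, 0.0], 200, rng), WEIGHTS, MEANS, VARS)
office_b = signature(sample([0.3, -0.2], 200, rng), WEIGHTS, MEANS, VARS)
street = signature(sample([4.0, 4.0], 200, rng), WEIGHTS, MEANS, VARS)

same = cosine(office_a, office_b)  # two segments from similar conditions
diff = cosine(office_a, street)    # segments from different conditions
print(same, diff)
```

In this toy setup, each segment is reduced to a single fixed-length vector, so comparing two environments costs one dot product regardless of segment length. That is one plausible reading of the efficiency goal stated in the abstract.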
