Few-Shot Continual Learning for Audio Classification

Abstract

Supervised learning for audio classification typically imposes a fixed class vocabulary, which can be limiting for real-world applications where the target class vocabulary is not known a priori or changes dynamically. In this work, we introduce a few-shot continual learning framework for audio classification, in which a trained base classifier can be continuously expanded to recognize novel classes from only a few labeled examples at inference time. This enables fast and interactive model updates by end-users with minimal human effort. To do so, we leverage the dynamic few-shot learning technique and adapt it to a challenging multi-label audio classification scenario. We incorporate a recent state-of-the-art audio feature extraction model as a backbone and perform a comparative analysis of our approach on two popular audio datasets (ESC-50 and AudioSet). We conduct an in-depth evaluation to illustrate the complexities of the problem and show that, while there is still room for improvement, our method outperforms three baselines on novel class detection while maintaining its performance on base classes.
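The idea of expanding a trained classifier with a novel class from a handful of labeled clips can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not describe the weight generator, so the sketch substitutes a plain prototype average of the support embeddings, and the class name `CosineMultiLabelClassifier`, the `scale` and `bias` values, and the toy 4-dimensional embeddings are all hypothetical. It assumes a frozen backbone has already turned each clip into an embedding vector, and it scores every class independently with a sigmoid so that several labels can be active at once.

```python
import math


def l2_normalize(vec, eps=1e-12):
    """Scale a vector to (approximately) unit length."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / (norm + eps) for v in vec]


class CosineMultiLabelClassifier:
    """Hypothetical cosine-similarity classifier with one sigmoid per class."""

    def __init__(self, base_weights, scale=10.0, bias=-5.0):
        # One unit-length weight vector per base class.
        self.weights = [l2_normalize(w) for w in base_weights]
        self.scale = scale  # assumed temperature on the cosine score
        self.bias = bias    # assumed shared offset (decision boundary at cosine 0.5)

    def add_novel_class(self, support_embeddings):
        """Append a class whose weight is the mean of the normalized support embeddings."""
        unit = [l2_normalize(e) for e in support_embeddings]
        dim = len(unit[0])
        prototype = [sum(e[d] for e in unit) / len(unit) for d in range(dim)]
        self.weights.append(l2_normalize(prototype))
        return len(self.weights) - 1  # index of the new class

    def predict_proba(self, embedding):
        """Return an independent probability for every known class."""
        z = l2_normalize(embedding)
        probs = []
        for w in self.weights:
            logit = self.scale * sum(a * b for a, b in zip(z, w)) + self.bias
            probs.append(1.0 / (1.0 + math.exp(-logit)))
        return probs


# Toy usage: two base classes, then one novel class from two support clips.
clf = CosineMultiLabelClassifier([[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]])
query = [0.9, 0.0, 0.1, 0.0]
before = clf.predict_proba(query)
novel_index = clf.add_novel_class([[0.0, 0.1, 1.0, 0.0], [0.0, -0.1, 1.0, 0.0]])
after = clf.predict_proba(query)
```

Because adding a class only appends a weight vector, the scores for the base classes are untouched in this sketch, which mirrors the goal of detecting novel classes while maintaining base-class performance. A learned weight generator, as in dynamic few-shot learning, would replace the simple average in `add_novel_class`.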