Ensemble of convolutional neural networks to improve animal audio classification

Loris Nanni; Yandre M. G. Costa; Rafael L. Aguiar; Rafael B. Mangolin; Sheryl Brahnam; Carlos N. Silla

Abstract

In this work, we present an ensemble for automated audio classification that fuses different types of features extracted from audio files. These features are evaluated, compared, and fused with the goal of producing better classification accuracy than other state-of-the-art approaches without ad hoc parameter optimization. We present an ensemble of classifiers that performs competitively on different types of animal audio datasets using the same set of classifiers and parameter settings. To produce this general-purpose ensemble, we ran a large number of experiments that fine-tuned pretrained convolutional neural networks (CNNs) for different audio classification tasks (bird, bat, and whale audio datasets). Six different CNNs were tested, compared, and combined. Moreover, a further CNN, trained from scratch, was tested and combined with the fine-tuned CNNs. To the best of our knowledge, this is the largest study on CNNs in animal audio classification. Our results show that several CNNs can be fine-tuned and fused for robust and generalizable audio classification. Finally, the ensemble of CNNs is combined with handcrafted texture descriptors obtained from spectrograms for further improvement of performance.

