Ecological Informatics: An International Journal on Ecoinformatics and Computational Ecology

Detection and identification of European woodpeckers with deep convolutional neural networks



Abstract

Every spring, European forest soundscapes fill up with the drums and calls of woodpeckers as they mark out territories and pair up. Each drum or call is species-specific and easily picked up by a trained ear. In this study, we worked toward automating this process and thus toward making the continuous acoustic monitoring of woodpeckers practical. We recorded from March to May, successively in Belgium, Luxembourg, and France, collecting hundreds of gigabytes of data. We shed 50-80% of these recordings using the Acoustic Complexity Index (ACI). Then, for both the detection of the target signals in the audio stream and the identification of the different species, we implemented transfer learning from computer vision to audio analysis. This meant transforming sounds into images via spectrograms and retraining publicly available legacy deep image networks (e.g. Inception) to work with such data. The visual patterns produced by drums (vertical lines) and call syllables (hats, straight lines, waves, etc.) in spectrograms are characteristic and allow identification of the signals. We retrained using data from Xeno-Canto, Tierstimmen, and a private collection. In the subsequent analysis of the field recordings, the repurposed networks gave outstanding results for the detection of drums (0.2-9.9% false positives, or, for the toughest dataset, a reduction from 28,601 images to 1,000 images left for manual review) and for the detection and identification of calls (73.5-100.0% accuracy; in the toughest case, a dataset reduction from 643,901 images to 14,667 images). However, they performed less well for the identification of drums than a simpler method using handcrafted features and the k-Nearest Neighbor (k-NN) classifier. The species character in drums lies not in shapes but in temporal patterns: speed, acceleration, number of strikes, and duration of the drums.
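The ACI-based shedding step described above can be sketched as follows. This is a minimal reimplementation of the Acoustic Complexity Index (per its standard definition, computed over a precomputed magnitude spectrogram), not the authors' code; the function names and the thresholding rule are illustrative.

```python
import numpy as np

def acoustic_complexity_index(spectrogram):
    """Acoustic Complexity Index over a magnitude spectrogram
    (rows = frequency bins, columns = time frames).

    For each frequency bin: sum the absolute intensity differences
    between adjacent frames, normalised by the bin's total intensity;
    the ACI is the sum over all bins. Biophonies (bird signals) vary
    quickly between frames and score high; steady noise scores low."""
    spectrogram = np.asarray(spectrogram, dtype=float)
    diffs = np.abs(np.diff(spectrogram, axis=1)).sum(axis=1)
    totals = spectrogram.sum(axis=1)
    # Silent bins contribute nothing (avoids division by zero).
    per_bin = np.divide(diffs, totals,
                        out=np.zeros_like(diffs), where=totals > 0)
    return float(per_bin.sum())

def keep_recording(spectrogram, threshold):
    """Shedding rule: keep a recording only if its ACI exceeds a
    threshold tuned so that 50-80% of the data is discarded."""
    return acoustic_complexity_index(spectrogram) > threshold
```

A perfectly steady signal (constant spectrogram) yields an ACI of zero, while frame-to-frame intensity changes raise it; in practice the threshold would be calibrated per site against a sample of manually reviewed recordings.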
These features are secondary information in spectrograms, and image networks that have learned invariance to object size tend to disregard them. At locations where the birds drummed abundantly, accuracy was 83.0% for Picus canus (93.1% for k-NN) and 36.1% for Dryocopus martius (81.5% for k-NN). For the three field locations, we produced timelines of the encountered woodpecker activity (6 species, 11 signals).
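The simpler drum classifier can be sketched as below. This is a hypothetical reconstruction, not the study's implementation: it assumes features derived from detected strike times (number of strikes, duration, speed, and the trend of the inter-strike intervals as a proxy for acceleration) fed to a plain Euclidean k-NN; the exact feature definitions in the paper may differ.

```python
import numpy as np

def drum_features(strike_times):
    """Temporal drum descriptors (illustrative): number of strikes,
    drum duration, speed in strikes per second, and the linear trend
    of the inter-strike intervals as an acceleration proxy (a
    negative slope means the drum speeds up)."""
    t = np.asarray(strike_times, dtype=float)
    intervals = np.diff(t)
    duration = t[-1] - t[0]
    speed = (len(t) - 1) / duration
    accel = np.polyfit(np.arange(len(intervals)), intervals, 1)[0]
    return np.array([len(t), duration, speed, accel])

def knn_predict(train_X, train_y, x, k=3):
    """Plain k-NN: Euclidean distance, majority vote among the
    k nearest labelled drums."""
    dist = np.linalg.norm(np.asarray(train_X) - np.asarray(x), axis=1)
    votes = [train_y[i] for i in np.argsort(dist)[:k]]
    return max(set(votes), key=votes.count)
```

Because the discriminative information here is purely temporal, a nearest-neighbour lookup in this small feature space can separate fast, short drums from slow, long ones even when their spectrogram shapes (vertical lines) look alike to an image network.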
