Home > US National Institutes of Health literature > PLoS Clinical Trials > Scalable preprocessing of high volume environmental acoustic data for bioacoustic monitoring

Scalable preprocessing of high volume environmental acoustic data for bioacoustic monitoring

Abstract

In this work, we examine the problem of efficiently preprocessing and denoising high-volume environmental acoustic data, a necessary step in many bird monitoring tasks. Preprocessing typically consists of multiple steps that are considered separately from each other. These steps are often resource intensive, particularly because the volume of data involved is high. We focus on two challenges within this problem: how to combine existing preprocessing tasks while maximising the effectiveness of each step, and how to execute the resulting pipeline quickly and efficiently, so that it can be used to process high volumes of acoustic data. We describe a distributed system designed specifically for this problem, utilising a master-slave model with data parallelisation. By investigating the impact of individual preprocessing tasks on each other, and their execution times, we determine an efficient and accurate order for preprocessing tasks within the distributed system. We find that, using a single core, our pipeline executes 1.40 times faster than manually executing all preprocessing tasks. We then apply our pipeline in the distributed system and evaluate its performance. We find that our system is capable of preprocessing bird acoustic recordings at a rate of 174.8 seconds of audio per second of real time with 32 cores over 8 virtual machines, which is 21.76 times faster than a serial process.
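The throughput figures in the abstract can be related to each other with a little arithmetic. The sketch below is illustrative only and is not taken from the paper: it assumes that the 21.76x speedup is measured in the same unit as the 174.8 figure (seconds of audio per second of real time) and that all 32 cores contribute to processing. The function names are hypothetical.

```python
# Figures reported in the abstract.
PARALLEL_RATE = 174.8  # seconds of audio processed per second of real time
SPEEDUP = 21.76        # reported speedup over a serial process
CORES = 32             # cores spread over 8 virtual machines


def implied_serial_rate(parallel_rate: float, speedup: float) -> float:
    """Serial throughput implied by the parallel rate and the speedup.

    Assumption: the speedup is a ratio of throughputs in the same unit.
    """
    return parallel_rate / speedup


def parallel_efficiency(speedup: float, cores: int) -> float:
    """Fraction of ideal linear speedup achieved (1.0 would be perfect scaling)."""
    return speedup / cores


def wall_clock_seconds(audio_seconds: float, rate: float) -> float:
    """Real time needed to preprocess a recording of the given length."""
    return audio_seconds / rate


serial_rate = implied_serial_rate(PARALLEL_RATE, SPEEDUP)
efficiency = parallel_efficiency(SPEEDUP, CORES)
one_day = wall_clock_seconds(24 * 60 * 60, PARALLEL_RATE)

print(f"implied serial rate: {serial_rate:.2f} s of audio per s")
print(f"parallel efficiency: {efficiency:.2f}")
print(f"time for 24 h of audio: {one_day / 60:.1f} min")
```

Under these assumptions, the serial process handles roughly 8 seconds of audio per second, the 32-core configuration reaches about 68% of ideal linear scaling, and a full day of recordings takes a little over 8 minutes to preprocess.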