Conference: 2011 Joint Workshop on Hands-free Speech Communication and Microphone Arrays

Audio spatio-temporal fingerprints for cloudless real-time hands-free diarization on mobile devices



Abstract

In this paper, we propose a new low-bit-rate representation of a sound field and a corresponding new method for cloudless, low-delay, hands-free diarization suitable for low-performance mobile devices such as mobile phones. The proposed audio spatio-temporal fingerprint representation has a low bit rate (500 bytes/second) yet contains complete information about the continuous audio tracking of multiple acoustic sources in an open, unconstrained environment. The core of the algorithm is the simultaneous processing of multiple data streams, using the audio spatio-temporal fingerprint representation to cover higher-level events relevant for diarization, e.g. turns, interruptions, crosstalk, and speech and non-speech segments. Performance levels achieved to date on 5 hours of hand-labelled datasets show the feasibility of the approach, with a CPU load of 7.58% on a single-core ultra-low-power mobile processor running at 1 GHz and a low algorithmic delay of 112 ms.
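
To put the quoted figures in perspective, the following is a minimal back-of-the-envelope sketch in Python. It uses only the numbers stated in the abstract (500 bytes/second, 112 ms, 7.58% load at 1 GHz). The function name `fingerprint_budget` and the derived quantities are illustrative assumptions, not part of the paper. In particular, the cycle estimate assumes that CPU load translates proportionally into clock cycles.

```python
# Figures quoted in the abstract.
BYTES_PER_SECOND = 500        # fingerprint bit rate
ALGORITHMIC_DELAY_MS = 112    # algorithmic delay
CPU_LOAD_PERCENT = 7.58       # load on a single core
CLOCK_HZ = 1_000_000_000      # 1 GHz processor clock


def fingerprint_budget():
    """Derive rough data and compute budgets (hypothetical helper)."""
    return {
        # 500 bytes/s expressed in bits per second.
        "bits_per_second": BYTES_PER_SECOND * 8,
        # Fingerprint data produced during one algorithmic-delay interval.
        "bytes_per_delay_interval": BYTES_PER_SECOND * ALGORITHMIC_DELAY_MS / 1000,
        # Fingerprint data produced per hour of audio.
        "bytes_per_hour": BYTES_PER_SECOND * 3600,
        # Assumption: load maps linearly onto clock cycles.
        "approx_cycles_per_second": CLOCK_HZ * CPU_LOAD_PERCENT / 100,
    }


if __name__ == "__main__":
    for name, value in fingerprint_budget().items():
        print(f"{name}: {value:,.0f}")
```

Under these assumptions, one hour of fingerprint data occupies about 1.8 MB, and roughly 56 bytes of fingerprint data accumulate during a single 112 ms delay interval.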
