ACM Transactions on Interactive Intelligent Systems

A Human-in-the-Loop System for Sound Event Detection and Annotation

Abstract

Labeling of audio events is essential for many tasks. However, finding sound events and labeling them within a long audio file is tedious and time-consuming. In cases where there is very little labeled data (e.g., a single labeled example), it is often not feasible to train an automatic labeler because many techniques (e.g., deep learning) require a large number of human-labeled training examples. Also, fully automated labeling may not show sufficient agreement with human labeling for many uses. To address these issues, we present a human-in-the-loop sound labeling system that helps a user quickly label target sound events in a long audio recording. It lets a user reduce the time required to label a long audio file (e.g., 20 hours) containing target sounds that are sparsely distributed throughout the recording (10% or less of the audio contains the target) when there are too few labeled examples (e.g., one) to train a state-of-the-art machine audio labeling system. To evaluate the effectiveness of our tool, we performed a human-subject study. The results show that it helped participants label target sound events twice as fast as labeling them manually. In addition to measuring the overall performance of the proposed system, we also measure interaction overhead and machine accuracy, which are two key factors that determine the overall performance. The analysis shows that an ideal interface with no interaction overhead could speed up labeling by as much as a factor of four.