Conference: IEEE International Conference on Acoustics, Speech and Signal Processing
Impact of Sound Duration and Inactive Frames on Sound Event Detection Performance

Abstract

In many sound event detection (SED) methods, each segmented time frame is treated as one data sample for model training. The duration of a sound event depends strongly on its class; e.g., the sound event "fan" has a long duration, whereas the sound event "mouse clicking" is instantaneous. This difference in duration between sound event classes therefore results in a serious data imbalance in SED. Moreover, most sound events occur only occasionally, so there are many more inactive time frames than active frames. This causes a further severe data imbalance between active and inactive frames. In this paper, we investigate the impact of sound duration and inactive frames on SED performance by introducing four loss functions: simple reweighting loss, inverse frequency loss, asymmetric focal loss, and focal batch Tversky loss. We then provide insights into how to tackle this imbalance problem.
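To make the two kinds of imbalance concrete, the following is a minimal Python sketch, not the paper's implementation. The abstract does not give the loss definitions, so the forms below are assumptions: inverse-frequency weights are taken as proportional to 1 / (number of active frames per class), and the asymmetric focal loss is written in one commonly used form with separate exponents for active and inactive frames. All function names and the parameters gamma and zeta are hypothetical.

```python
import math

def frame_counts(labels):
    """Count active and inactive frames per class.

    labels: list of frames, each a list of 0/1 activity flags per class.
    """
    n_classes = len(labels[0])
    active = [sum(frame[c] for frame in labels) for c in range(n_classes)]
    inactive = [len(labels) - a for a in active]
    return active, inactive

def inverse_frequency_weights(labels):
    """Assumed form: weight ~ 1 / active-frame count, normalized to mean 1."""
    active, _ = frame_counts(labels)
    raw = [1.0 / max(a, 1) for a in active]  # guard against classes with no active frame
    mean = sum(raw) / len(raw)
    return [r / mean for r in raw]

def weighted_bce(probs, labels, class_weights, eps=1e-7):
    """Frame-wise binary cross-entropy; active-frame terms are scaled per class."""
    total = 0.0
    for p_frame, y_frame in zip(probs, labels):
        for p, y, w in zip(p_frame, y_frame, class_weights):
            p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
            total += -(w * y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)

def asymmetric_focal_loss(probs, labels, gamma=0.0, zeta=1.0, eps=1e-7):
    """Assumed form: (1-p)^gamma modulates active frames, p^zeta inactive ones."""
    total = 0.0
    for p_frame, y_frame in zip(probs, labels):
        for p, y in zip(p_frame, y_frame):
            p = min(max(p, eps), 1.0 - eps)
            total += -((1.0 - p) ** gamma * y * math.log(p)
                       + p ** zeta * (1 - y) * math.log(1.0 - p))
    return total / len(labels)

# Toy clip of 10 frames with two classes: "fan" (long) and "mouse clicking" (short).
labels = [[1, 0]] * 8 + [[0, 1]] + [[0, 0]]
probs = [[0.3, 0.3] for _ in labels]  # constant placeholder predictions

active, inactive = frame_counts(labels)
weights = inverse_frequency_weights(labels)
print("active frames per class:", active)
print("inactive frames per class:", inactive)
print("inverse-frequency weights:", weights)
print("weighted BCE:", weighted_bce(probs, labels, weights))
print("asymmetric focal loss:", asymmetric_focal_loss(probs, labels))
```

In this toy clip the short "mouse clicking" class receives a larger weight than the long "fan" class, and a positive zeta shrinks the contribution of inactive frames that the model already scores low, which is the intuition behind reweighting the dominant inactive frames.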