Journal: Engineering Applications of Artificial Intelligence

Audio content analysis for unobtrusive event detection in smart homes


Abstract

Environmental sound signals are multi-source, heterogeneous, and time-varying. Many systems have been proposed to process such signals for event detection in ambient assisted living applications. Typically, these systems use feature extraction, feature selection, and classification. However, despite major advances, several important questions remain unanswered, especially in real-world settings. This paper addresses the following questions for ambient sounds recorded in various real-world kitchen environments: (1) Which features and which classifiers are most suitable in the presence of background noise? (2) What is the effect of signal duration on recognition accuracy? (3) How do the signal-to-noise ratio and the distance between the microphone and the audio source affect recognition accuracy in an environment in which the system was not trained? We show that for systems that use traditional classifiers, it is beneficial to combine gammatone frequency cepstral coefficients with discrete wavelet transform coefficients and to use a gradient boosting classifier. For systems based on deep learning, we consider 1D and 2D Convolutional Neural Networks (CNNs) that take mel-spectrogram energies and mel-spectrogram images as inputs, respectively, and show that the 2D CNN outperforms the 1D CNN. We obtained competitive classification results for two such systems. The first, which uses a gradient boosting classifier, achieved an F1-score of 90.2% and a recognition accuracy of 91.7%. The second, which uses a 2D CNN with mel-spectrogram images, achieved an F1-score of 92.7% and a recognition accuracy of 96%.
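Question (3) depends on how the signal-to-noise ratio (SNR) is defined. The abstract does not say how the authors measured or controlled SNR, so the following is only a minimal sketch of the standard definition, SNR in dB = 10 log10(signal power / noise power), applied to synthetic data. The function names `mean_power`, `mix_at_snr`, and `measure_snr_db`, the 440 Hz tone, the Gaussian noise, and the 16 kHz sample rate are all illustrative assumptions, not details from the paper.

```python
import math
import random


def mean_power(samples):
    """Mean power of a sample sequence (mean of squared amplitudes)."""
    return sum(x * x for x in samples) / len(samples)


def measure_snr_db(signal, noise):
    """SNR in decibels: 10 * log10(signal power / noise power)."""
    return 10.0 * math.log10(mean_power(signal) / mean_power(noise))


def mix_at_snr(signal, noise, target_snr_db):
    """Scale `noise` so the mixture has the requested SNR, then add it to `signal`.

    Returns the mixture and the scaled noise (hypothetical helper, not from the paper).
    """
    if len(signal) != len(noise):
        raise ValueError("signal and noise must have the same length")
    # Required power ratio P_signal / P_noise for the target SNR.
    target_ratio = 10.0 ** (target_snr_db / 10.0)
    # Amplitude gain g such that P_signal / (g^2 * P_noise) == target_ratio.
    gain = math.sqrt(mean_power(signal) / (mean_power(noise) * target_ratio))
    scaled_noise = [gain * n for n in noise]
    mixture = [s + n for s, n in zip(signal, scaled_noise)]
    return mixture, scaled_noise


# Synthetic stand-ins (assumptions): a 1-second 440 Hz tone as the "event"
# and Gaussian noise as the "background".
rng = random.Random(0)
sample_rate = 16000
event = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(sample_rate)]
background = [rng.gauss(0.0, 1.0) for _ in range(sample_rate)]

mixture, scaled_background = mix_at_snr(event, background, target_snr_db=5.0)
achieved_snr_db = measure_snr_db(event, scaled_background)
```

Sweeping `target_snr_db` over a range of values is one common way to probe how a trained classifier degrades as noise increases; whether the paper used synthetic mixing or SNRs measured from real recordings is not stated in the abstract.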
