
Audio scene recognition based on audio events and topic model



Abstract

Topic models are an active research area that is attracting attention from many fields. Recently, several studies have applied topic models to audio scene recognition (ASR). Most of these studies use the document-word co-occurrence matrix for topic analysis. In this work, we propose a new ASR algorithm based on audio events and a topic model, which instead uses the document-event co-occurrence matrix for topic analysis. Our work rests on the following hypothesis: for an audio document, the event distribution is closer to the way humans think than the word distribution is, so the topic distribution obtained from the document-event co-occurrence matrix represents the audio document better. The contributions of this work are: (1) we propose an ASR algorithm that uses the document-event co-occurrence matrix for topic analysis; compared with existing studies that use the document-word co-occurrence matrix, the proposed algorithm extracts a topic distribution that represents audio documents better and therefore achieves better recognition results; (2) we propose a much simpler method for obtaining the document-event co-occurrence matrix; (3) we propose a method for weighting the event distribution of audio documents, which emphasizes audio events that are important for reflecting the unique topics of an audio document and suppresses audio events that are common to many topics. Experimental results on two public datasets verify the effectiveness of the proposed ASR algorithm, as well as the necessity and effectiveness of the proposed weighting method. The ideas in this work are not limited to ASR and can be extended to other fields, such as video classification. (C) 2017 Elsevier B.V. All rights reserved.
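
To make the document-event co-occurrence matrix and the idea of event weighting concrete, the following is a minimal sketch, not the authors' method. The abstract does not state how events are detected or which weighting formula is used, so the event labels, the function names, and the smoothed IDF-style weight are all assumptions chosen only for illustration. In a full pipeline, the weighted rows would be passed to a topic model such as PLSA or LDA, and the resulting topic distributions to a classifier such as a support vector machine.

```python
import math
from collections import Counter


def cooccurrence_matrix(docs, vocabulary):
    """Count how often each event label occurs in each audio document."""
    matrix = []
    for events in docs:
        counts = Counter(events)
        matrix.append([counts[label] for label in vocabulary])
    return matrix


def idf_weights(matrix):
    """Assumed stand-in weight: smoothed IDF, larger for events seen in few documents."""
    n_docs = len(matrix)
    weights = []
    for j in range(len(matrix[0])):
        doc_freq = sum(1 for row in matrix if row[j] > 0)
        weights.append(math.log((1 + n_docs) / (1 + doc_freq)) + 1.0)
    return weights


def normalize_rows(matrix):
    """Turn each row into a distribution that sums to 1 (all-zero rows stay zero)."""
    result = []
    for row in matrix:
        total = sum(row)
        result.append([v / total if total else 0.0 for v in row])
    return result


def weighted_distribution(matrix, weights):
    """Scale each event count by its weight, then renormalize per document."""
    scaled = [[c * w for c, w in zip(row, weights)] for row in matrix]
    return normalize_rows(scaled)


# Hypothetical audio documents, each given as a sequence of detected event labels.
docs = [
    ["speech", "speech", "laughter", "music"],
    ["speech", "engine", "engine", "horn"],
    ["speech", "music", "music", "applause"],
]
vocabulary = sorted({label for events in docs for label in events})

counts = cooccurrence_matrix(docs, vocabulary)    # document-event co-occurrence matrix
weights = idf_weights(counts)                     # one weight per event label
raw = normalize_rows(counts)                      # unweighted event distribution
weighted = weighted_distribution(counts, weights) # weighted event distribution

for name, row in zip(["doc0", "doc1", "doc2"], weighted):
    print(name, [round(v, 3) for v in row])
```

In this toy example, "speech" occurs in every document and therefore receives the smallest weight, while events confined to a single document, such as "engine", gain a larger share of that document's weighted distribution. This mirrors the stated goal of suppressing events common to many topics and emphasizing distinctive ones.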

Bibliographic record

  • Source
    Knowledge-Based Systems | 2017, Issue 1 | pp. 1-12 | 12 pages
  • Author affiliations

    Shandong Normal Univ, Inst Biomed Sci, Sch Phys & Elect, Shandong Prov Key Lab Med Phys & Image Proc Techn, Jinan 250014, Peoples R China;

    Shandong Normal Univ, Inst Biomed Sci, Sch Phys & Elect, Shandong Prov Key Lab Med Phys & Image Proc Techn, Jinan 250014, Peoples R China;

    Nanchang Hangkong Univ, Sch Informat, Nanchang 330063, Jiangxi, Peoples R China;

    Shandong Coll Elect Technol, Dept Comp Sci & Technol, Jinan 250014, Peoples R China;

    Shandong Normal Univ, Inst Biomed Sci, Sch Phys & Elect, Shandong Prov Key Lab Med Phys & Image Proc Techn, Jinan 250014, Peoples R China;

    Shandong Normal Univ, Inst Biomed Sci, Sch Phys & Elect, Shandong Prov Key Lab Med Phys & Image Proc Techn, Jinan 250014, Peoples R China;

    Univ Jinan, Shandong Prov Key Lab Network Based Intelligent C, Sch Informat Sci & Engn, Jinan 250014, Peoples R China;

    Shandong Normal Univ, Inst Biomed Sci, Sch Phys & Elect, Shandong Prov Key Lab Med Phys & Image Proc Techn, Jinan 250014, Peoples R China;

  • Indexing information
  • Original format: PDF
  • Text language: English
  • Chinese Library Classification
  • Keywords

    Audio scene recognition; Audio event; Topic model; PLSA; LDA; Support vector machine;

