Audio content analysis in the presence of overlapped classes : a non-exclusive segmentation approach to mitigate information losses

Abstract

Soundtracks of multimedia files are information rich, and much content-related metadata can be extracted from them. There is a pressing demand for automated classification, identification and information mining of audio content. A segment of an audio soundtrack can be speech, music, event sounds or a combination of them. Many individual algorithms exist for the recognition and analysis of speech, music or event sounds, allowing embedded information to be retrieved in a semantic fashion. A systematic review shows that no universal system exists that is optimised to extract the maximum amount of information for further text mining and inference. Mainstream algorithms typically work with a single class of sound, e.g. speech, music or event sounds, and classification methods are predominantly exclusive (they detect one class at a time), so they lose much information when two or three classes overlap.

A universal open architecture for audio content and scene analysis has been proposed by the authors. To mitigate information losses in overlapped content, non-exclusive segmentation approaches were adopted. This paper presents one possible implementation of the universal open architecture as a paradigm, showing how the architecture can integrate existing methods and workflows while maximising the extractable semantic information.

In the current work, overlapped content is identified and segmented using carefully tailored feature spaces, and a family of decision trees is used to generate a content score. Results show that the developed system, when compared with well-established audio content analysers, can identify, and thus extract information from, many more speech and music segments. The full paper will discuss the methods, detail the results and illustrate how the system works.
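The difference between exclusive and non-exclusive segmentation can be illustrated with a minimal Python sketch. It is not the authors' implementation: the feature names, thresholds and the use of depth-1 trees (stumps) as the "family of decision trees" are all assumptions made for illustration. Each class has its own small family of trees that votes independently, the fraction of positive votes serves as a hypothetical content score, and a frame may therefore carry several class labels at once.

```python
from typing import Dict, FrozenSet, List, Sequence, Tuple

# Hypothetical per-frame features; names and thresholds are illustrative
# assumptions, not values taken from the paper.
Frame = Dict[str, float]

# A stump is a depth-1 decision tree: (feature, threshold, votes_yes_if_above).
Stump = Tuple[str, float, bool]

# One small family of stumps per class; each family is evaluated independently.
FAMILIES: Dict[str, List[Stump]] = {
    "speech": [
        ("zcr_variance", 0.5, True),
        ("harmonicity", 0.8, False),
        ("energy_modulation_4hz", 0.4, True),
    ],
    "music": [
        ("harmonicity", 0.6, True),
        ("spectral_flux", 0.5, False),
        ("beat_strength", 0.3, True),
    ],
}


def content_score(frame: Frame, stumps: Sequence[Stump]) -> float:
    """Fraction of stumps in one family that vote for the class."""
    votes = [(frame[feat] > thr) == above for feat, thr, above in stumps]
    return sum(votes) / len(votes)


def label_frame(frame: Frame, threshold: float = 0.5) -> FrozenSet[str]:
    """Non-exclusive labelling: every class is thresholded on its own score."""
    return frozenset(
        cls for cls, stumps in FAMILIES.items()
        if content_score(frame, stumps) > threshold
    )


def segment(frames: Sequence[Frame], threshold: float = 0.5):
    """Merge consecutive frames with identical label sets.

    Returns (start_frame, end_frame_exclusive, labels) tuples.
    """
    segments: List[Tuple[int, int, FrozenSet[str]]] = []
    for i, frame in enumerate(frames):
        labels = label_frame(frame, threshold)
        if segments and segments[-1][2] == labels:
            start, _, _ = segments[-1]
            segments[-1] = (start, i + 1, labels)
        else:
            segments.append((i, i + 1, labels))
    return segments


# Made-up example frames: speech only, speech over music, music only.
speech_only = {"zcr_variance": 0.9, "harmonicity": 0.3,
               "energy_modulation_4hz": 0.7, "spectral_flux": 0.8,
               "beat_strength": 0.1}
overlap = {"zcr_variance": 0.9, "harmonicity": 0.7,
           "energy_modulation_4hz": 0.7, "spectral_flux": 0.3,
           "beat_strength": 0.6}
music_only = {"zcr_variance": 0.2, "harmonicity": 0.9,
              "energy_modulation_4hz": 0.1, "spectral_flux": 0.2,
              "beat_strength": 0.8}

result = segment([speech_only, speech_only, overlap, music_only])
```

An exclusive classifier would have to assign the overlapped frame to either speech or music and discard the other; here the middle segment keeps both labels, so a downstream speech recogniser and a music analyser could each still be run on it.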