Conference: 11th International Workshop on Image Analysis for Multimedia Interactive Services

Wikipedia based semantic metadata annotation of audio transcripts

Abstract

A method for automatically annotating video items with semantic metadata is presented. The method was developed in the context of the Papyrus project to annotate documentary-like broadcast videos with a set of relevant keywords, using automatic speech recognition (ASR) transcripts as the primary complementary resource. The task is complicated by the high word error rate (WER) of ASR on this kind of video. For this reason, a novel relevance criterion based on domain information is proposed. Wikipedia is used both as a source of metadata and as a linguistic resource for disambiguating keywords and for eliminating off-topic or out-of-domain keywords. Documents are annotated with relevant links to Wikipedia pages, concept definitions, synonyms, translations, and concept categories.
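
The abstract does not spell out the relevance criterion, so the following is only a minimal sketch of one way a domain-based filter could work, not the paper's method. It assumes each candidate keyword has already been mapped to a set of Wikipedia-style categories. The `TOY_CATEGORIES` table, the `filter_by_domain` function, and the `min_share` parameter are all hypothetical names introduced for illustration. Keywords that share no category with the dominant domain of the transcript (for example, ASR misrecognitions) are dropped.

```python
from collections import Counter

# Hypothetical toy data: candidate keyword -> Wikipedia-style categories.
# A real system would look these up in Wikipedia; here they are hard-coded.
TOY_CATEGORIES = {
    "volcano": {"Geology", "Natural hazards"},
    "magma": {"Geology"},
    "eruption": {"Geology", "Natural hazards"},
    "guitar": {"Music"},  # plausible ASR error in a geology documentary
}


def filter_by_domain(keywords, categories, min_share=0.5):
    """Keep keywords that share a category with the dominant domain.

    The "domain" is assumed to be the set of categories whose frequency
    among the candidates is at least min_share times that of the most
    frequent category. This threshold rule is an assumption of the sketch.
    """
    # Count how often each category occurs across all candidate keywords.
    counts = Counter(cat for kw in keywords for cat in categories.get(kw, ()))
    if not counts:
        return []
    threshold = min_share * max(counts.values())
    domain = {cat for cat, n in counts.items() if n >= threshold}
    # A keyword is kept if at least one of its categories is in the domain.
    return [kw for kw in keywords if categories.get(kw, set()) & domain]


candidates = ["volcano", "magma", "eruption", "guitar"]
kept = filter_by_domain(candidates, TOY_CATEGORIES)
print(kept)
```

With the toy data above, "guitar" is discarded because its only category, "Music", is far less frequent than "Geology" among the candidates. Keywords missing from the category table are also dropped, which is a design choice of this sketch rather than something stated in the abstract.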
