Journal: Computer Vision and Image Understanding

Measuring novelty and redundancy with multiple modalities in cross-lingual broadcast news



Abstract

News videos from different channels and languages are broadcast every day, providing abundant information for users. To effectively search, retrieve, browse and track news stories, news story similarity plays a critical role in assessing the novelty and redundancy among news stories. In this paper, we explore different measures for novelty and redundancy detection in cross-lingual news stories. A news story is represented by multimodal features, which include a sequence of keyframes in the visual track and a set of words and named entities extracted from the speech transcript in the audio track. Vector space models and language models on individual features (text, named entities and keyframes) are constructed to compare the similarity among stories. Furthermore, multiple modalities are fused to improve performance. Experiments on the TRECVID-2005 cross-lingual news video corpus show that performance on novelty and redundancy detection varies across modalities and measures. Language models on text are appropriate for detecting completely redundant stories, while Cosine Distance on keyframes is suitable for detecting somewhat redundant stories. Performance on mono-lingual topics is better than on multilingual topics. Textual and visual features complement each other, and fusing text, named entities and keyframes substantially improves performance, outperforming approaches that use individual features alone.
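The vector space comparison and modality fusion described in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's term weighting, tokenization, or fusion scheme, so everything here is an assumption: raw term counts, novelty defined as one minus the maximum cosine similarity to earlier stories, a simple weighted average for fusion, and the hypothetical function names `cosine_similarity`, `novelty_score` and `fused_similarity`.

```python
import math
from collections import Counter


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty story shares nothing with any other story
    return dot / (norm_a * norm_b)


def novelty_score(story: list[str], earlier: list[list[str]]) -> float:
    """Assumed definition: 1 minus the highest similarity to any earlier story."""
    if not earlier:
        return 1.0  # nothing seen before, so the story is fully novel
    vec = Counter(story)
    return 1.0 - max(cosine_similarity(vec, Counter(e)) for e in earlier)


def fused_similarity(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Assumed fusion: weighted average of per-modality similarity scores."""
    total = sum(weights[m] for m in scores)
    return sum(weights[m] * scores[m] for m in scores) / total


if __name__ == "__main__":
    earlier = [["flood", "city", "rescue"], ["election", "vote", "result"]]
    print(novelty_score(["flood", "rescue", "team"], earlier))
    print(fused_similarity({"text": 0.8, "entities": 0.6, "keyframes": 0.4},
                           {"text": 1.0, "entities": 1.0, "keyframes": 1.0}))
```

In this sketch the same `cosine_similarity` function could be applied to named-entity counts or to quantized keyframe features, with the per-modality scores then combined by `fused_similarity`; the weights shown are placeholders, not values from the paper.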
