Talking faces indexing in TV-content

Abstract

Our objective is to index talking faces in a TV context: to build a description of TV-content in terms of talking people, without any predefined dictionary of identities. In TV-content, multi-face shots and shots of non-speaking faces make it difficult to determine which face is speaking. This work proposes a method that clusters people independently from the audio and from the visual information, then combines the two clusterings to detect sequences of talking faces. The audio indexing system is based on agglomerative clustering with the Bayesian Information Criterion. The visual indexing system is based on costume detection and clustering of color histograms. The two indexes are combined by searching for the best match between the two clusterings, which yields a correspondence between the automatic audio labels and the automatic video labels. The talking faces are then determined by intersecting the segments of the associated audio and video labels. Experiments on a TV-show database show that the proposed method can achieve a high correct detection rate.
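The combination step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the segment representation `(start, end, label)`, the function names, and the greedy one-to-one matching by total overlap duration are all assumptions made for illustration, since the abstract only states that a best match between the two clusterings is searched for.

```python
from collections import defaultdict

# Assumed representation: a segment is (start_sec, end_sec, label),
# where label is an automatic cluster id from the audio or video index.

def overlap(a, b):
    """Duration shared by two segments; zero if they are disjoint."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))

def cooccurrence(audio, video):
    """Total overlap time for each (audio_label, video_label) pair."""
    table = defaultdict(float)
    for a in audio:
        for v in video:
            d = overlap(a, v)
            if d > 0:
                table[(a[2], v[2])] += d
    return table

def match_labels(table):
    """Greedy one-to-one matching (an assumption, not the paper's criterion):
    repeatedly keep the unused pair with the largest total overlap."""
    pairs, used_video = {}, set()
    for (a_lab, v_lab), _ in sorted(table.items(), key=lambda kv: -kv[1]):
        if a_lab not in pairs and v_lab not in used_video:
            pairs[a_lab] = v_lab
            used_video.add(v_lab)
    return pairs

def talking_faces(audio, video, pairs):
    """Intersect the segments of each associated audio/video label pair."""
    out = []
    for a in audio:
        for v in video:
            if pairs.get(a[2]) == v[2]:
                start, end = max(a[0], v[0]), min(a[1], v[1])
                if end > start:
                    out.append((start, end, a[2], v[2]))
    return sorted(out)

# Hypothetical toy data: two speakers and two faces.
audio = [(0, 10, "spk1"), (10, 18, "spk2"), (18, 25, "spk1")]
video = [(0, 8, "faceA"), (8, 12, "faceB"), (12, 18, "faceB"), (20, 25, "faceA")]

pairs = match_labels(cooccurrence(audio, video))
segments = talking_faces(audio, video, pairs)
```

On this toy input, `spk1` is associated with `faceA` and `spk2` with `faceB`, and the talking-face segments are the time spans where an audio label and its associated video label are both present. The greedy matching could be replaced by an optimal assignment (for example `scipy.optimize.linear_sum_assignment` on the negated overlap table) if a globally best one-to-one correspondence is wanted.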
