TALKING FACES INDEXING IN TV-CONTENT

Abstract

Our objective is to index talking faces in TV content: to build a description of the content in terms of who is talking, without any predefined dictionary of identities. Because TV content contains shots with several faces and shots of faces that are not speaking, it is difficult to determine which face is speaking. We propose a method that clusters people independently on the audio and on the visual information, then combines the two clusterings to detect sequences of talking faces. The audio indexing system is based on agglomerative clustering with the Bayesian Information Criterion. The visual indexing system is based on costume detection and clustering of color histograms. The two indexes are combined by searching for the best match between the clusterings, which yields a correspondence between the automatic audio labels and the automatic video labels. The talking faces are then determined by intersecting the segments of the associated audio and video labels. Experiments on a TV-show database show that the proposed method achieves a high correct detection rate.
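The combination step can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the best match is searched for, so the sketch assumes a greedy one-to-one matching on shared duration. The segment format `(start, end, label)` and the function names `overlap`, `cooccurrence`, `best_match` and `talking_faces` are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def overlap(a_start, a_end, b_start, b_end):
    """Duration shared by two time intervals (0.0 if they are disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def cooccurrence(audio, video):
    """Total shared duration for every (audio label, video label) pair.

    Both inputs are lists of (start, end, label) tuples. Video segments
    may overlap in time, since several faces can be on screen at once.
    """
    table = defaultdict(float)
    for a_start, a_end, a_label in audio:
        for v_start, v_end, v_label in video:
            table[(a_label, v_label)] += overlap(a_start, a_end, v_start, v_end)
    return table


def best_match(audio, video):
    """Greedy one-to-one matching (an assumption, not the paper's search):
    repeatedly keep the unused label pair with the largest shared duration."""
    table = cooccurrence(audio, video)
    mapping, used_video = {}, set()
    for (a_label, v_label), shared in sorted(table.items(), key=lambda kv: -kv[1]):
        if shared > 0 and a_label not in mapping and v_label not in used_video:
            mapping[a_label] = v_label
            used_video.add(v_label)
    return mapping


def talking_faces(audio, video):
    """Intersect the segments of each matched audio/video label pair."""
    mapping = best_match(audio, video)
    result = []
    for a_start, a_end, a_label in audio:
        for v_start, v_end, v_label in video:
            if mapping.get(a_label) == v_label:
                start, end = max(a_start, v_start), min(a_end, v_end)
                if end > start:
                    result.append((start, end, a_label, v_label))
    return sorted(result)
```

Given audio segments labelled by speaker cluster and video segments labelled by costume cluster, `talking_faces` returns the intervals in which a matched speaker is both heard and seen. A globally optimal assignment could replace the greedy loop where ties or near-ties between label pairs matter.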