Source: Multimodal Technologies for Perception of Humans; Lecture Notes in Computer Science; 4122

Person Identification Based on Multichannel and Multimodality Fusion



Abstract

Person identity (ID) is very useful information for high-level video analysis and retrieval. In some scenarios, the recording is not only multimodal but also multichannel (microphone array, camera array). In this paper, we describe a multimodal person ID system based on multichannel and multimodal fusion. The audio-only system combines recordings from 7 microphone channels at the decision level, fusing the outputs of the individual single-channel audio-only systems. The audio system is modeled with a Universal Background Model (UBM) and the maximum a posteriori (MAP) adaptation framework, which is very popular in the speaker recognition literature. The visual-only system works directly in the appearance space via the l_1 norm and a nearest-neighbor classifier. Linear fusion then combines the two modalities to improve ID performance. The experiments indicate the effectiveness of both microphone array fusion and audio/visual fusion.
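The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state how the per-channel decisions are combined, how scores are normalized, or what fusion weight is used, so the score averaging in `fuse_channels`, the min-max normalization in `min_max`, and the weight `w` are all assumptions made for illustration. The per-channel audio scores are placeholder numbers standing in for the outputs of UBM/MAP speaker models, and all function names and toy data are hypothetical.

```python
from typing import Dict, List, Sequence

Scores = Dict[str, float]  # identity -> score (higher means more likely)


def l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """l_1 norm of the difference between two appearance vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def visual_scores(probe: Sequence[float],
                  gallery: Dict[str, List[Sequence[float]]]) -> Scores:
    """Nearest-neighbor scoring: negated l_1 distance to the closest gallery image."""
    return {pid: -min(l1_distance(probe, img) for img in imgs)
            for pid, imgs in gallery.items()}


def fuse_channels(per_channel: List[Scores]) -> Scores:
    """Decision-level channel fusion; averaging is an assumed rule, not from the paper."""
    ids = per_channel[0].keys()
    return {pid: sum(ch[pid] for ch in per_channel) / len(per_channel)
            for pid in ids}


def min_max(scores: Scores) -> Scores:
    """Map scores to [0, 1] so the two modalities are comparable (assumed step)."""
    lo, hi = min(scores.values()), max(scores.values())
    span = hi - lo
    return {pid: (s - lo) / span if span > 0 else 0.0
            for pid, s in scores.items()}


def linear_fusion(audio: Scores, visual: Scores, w: float = 0.5) -> Scores:
    """Linear fusion: w * audio + (1 - w) * visual; w = 0.5 is a placeholder weight."""
    a, v = min_max(audio), min_max(visual)
    return {pid: w * a[pid] + (1.0 - w) * v[pid] for pid in a}


def identify(scores: Scores) -> str:
    """Return the identity with the highest score."""
    return max(scores, key=scores.get)


# Toy data (hypothetical): three microphone channels instead of seven, for brevity.
channels = [
    {"alice": -1.0, "bob": -2.0},
    {"alice": -3.0, "bob": -1.5},  # a noisy channel that prefers the wrong identity
    {"alice": -1.2, "bob": -2.5},
]
gallery = {"alice": [[0.0, 0.0, 0.0]], "bob": [[1.0, 1.0, 1.0]]}
probe = [0.1, 0.2, 0.0]

audio = fuse_channels(channels)
visual = visual_scores(probe, gallery)
fused = linear_fusion(audio, visual, w=0.5)
print(identify(fused))
```

In this toy case the second channel alone would pick the wrong identity, while the averaged channel scores and the fused audio/visual scores both pick the correct one. In practice the fusion weight would be tuned on held-out data rather than fixed by hand.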
