
Bayesian three-dimensional multiple people tracking using multiple indoor cameras and microphones.



Abstract

This thesis presents Bayesian joint audio-visual tracking of the 3D locations of multiple people and the current speaker(s) in a real conference environment. To achieve this objective, it addresses several research topics: acoustic-feature detection, visual-feature detection, the tracking framework, data association, and sensor fusion. For acoustic-feature detection, time-delay-of-arrival (TDOA) estimation is used to detect multiple acoustic sources, and the localization performance obtained from TDOAs is analyzed for different microphone configurations. For visual-feature detection, Viola-Jones face detection is used to initialize the locations of multiple unknown people. Motion detection using corner features, seeded by the Viola-Jones detections, then follows these non-rigid frontal faces, face profiles, and upper bodies in normal tracking mode. Simple point-to-line correspondences between multiple cameras, computed with fundamental matrices, are used to determine which features are more robust. For data association and sensor fusion, Monte-Carlo JPDAF and data association with IPPF (DA-IPPF) are implemented within a particle-filtering framework. The proposed algorithms and framework are applied to three tracking scenarios: acoustic source tracking, visual source tracking, and joint acoustic-visual source tracking. Finally, the implementation of the joint acoustic-visual tracking system using cameras and microphones is described in two parts: system implementation and real-time processing.
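To make the TDOA step in the abstract concrete, the following is a minimal sketch of estimating the delay between two microphone signals. It uses plain time-domain cross-correlation, which is an assumption made for illustration; the abstract does not say which estimator the thesis uses. The function name `estimate_tdoa`, the sample rate, and the toy signals are all hypothetical.

```python
def estimate_tdoa(sig_a, sig_b, sample_rate, max_lag):
    """Estimate how many seconds sig_b lags sig_a.

    Brute-force cross-correlation: try every integer lag in
    [-max_lag, max_lag] and keep the one with the highest score.
    Illustrative only; not the thesis's estimator.
    """
    n = min(len(sig_a), len(sig_b))
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                score += sig_a[i] * sig_b[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag / sample_rate


# Toy data (hypothetical): a short pulse reaches microphone B
# five samples after it reaches microphone A.
SAMPLE_RATE = 16000  # Hz, assumed
mic_a = [0.0] * 64
mic_a[20], mic_a[21] = 1.0, 0.5
mic_b = [0.0] * 64
mic_b[25], mic_b[26] = 1.0, 0.5

tdoa = estimate_tdoa(mic_a, mic_b, SAMPLE_RATE, max_lag=10)
# Path-length difference, assuming a speed of sound of 343 m/s.
range_difference_m = tdoa * 343.0
```

Each TDOA constrains the source to a hyperboloid whose foci are the two microphones, which is why the geometry of the microphone array affects localization accuracy.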
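The point-to-line correspondence mentioned in the abstract can be sketched as follows: a point in one camera maps, through the fundamental matrix, to an epipolar line in the other camera, and a candidate feature is scored by its distance to that line. This is an illustrative sketch, not the thesis's code. The helper names and the matrix `F_RECTIFIED` (the fundamental matrix of an idealized rectified stereo pair, whose epipolar lines are horizontal) are assumptions chosen to keep the arithmetic easy to check.

```python
import math


def epipolar_line(F, point):
    """Return the line (a, b, c) = F @ [x, y, 1] in the second image."""
    x, y = point
    h = (x, y, 1.0)
    return tuple(sum(F[r][c] * h[c] for c in range(3)) for r in range(3))


def point_to_line_distance(line, point):
    """Perpendicular distance from a 2D point to the line a*x + b*y + c = 0."""
    a, b, c = line
    x, y = point
    return abs(a * x + b * y + c) / math.hypot(a, b)


# Assumed fundamental matrix for an ideal rectified pair (pure
# horizontal baseline): matching points share the same image row.
F_RECTIFIED = [
    [0.0, 0.0, 0.0],
    [0.0, 0.0, -1.0],
    [0.0, 1.0, 0.0],
]

line = epipolar_line(F_RECTIFIED, (10.0, 20.0))      # the row y = 20
on_line = point_to_line_distance(line, (35.0, 20.0))  # 0.0
off_line = point_to_line_distance(line, (35.0, 23.0))  # 3.0
```

A feature whose counterpart in another camera lies close to its epipolar line is geometrically consistent across views, which is one plausible way to rank features by robustness.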

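Bibliographic record

  • Author

    Lee, Yeongseon

  • Author affiliation

    Georgia Institute of Technology

  • Degree-granting institution: Georgia Institute of Technology
  • Subject: Engineering, Electronics and Electrical
  • Degree: Ph.D.
  • Year: 2009
  • Pages: 154 p.
  • Total pages: 154
  • Original format: PDF
  • Language of text: eng
  • Chinese Library Classification: Radio electronics, telecommunications technology
  • Keywords

  • Date indexed: 2022-08-17 11:37:41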
