Stereo active vision and peripheral optical flow: Computer vision applications of the wide-field human visual representation.

Abstract

The topographic structure of the central 20° of the primate visual field is approximated by a two-dimensional (complex variable) logarithmic function, a monopole map. Recent work shows that a second logarithmic pole in the far peripheral field provides a wide-angle model of visual representation. This dipole map is in good agreement with physiological data from the foveal to the far peripheral representation. The foveal and peripheral logarithmic poles have natural roles to play in object discrimination and egomotion, respectively. The first part of this work applies the foveal and parafoveal representation to the problem of designing a real-time stereo active vision system. A stereo “robotic head” was constructed with two cameras and actuators providing pan, tilt, and vergence. A set of attentional operators was developed for this system. These operators use cues based on motion, color, depth, and shape, first for conventional digital images and then for monopole-mapped images, and are demonstrated on the task of locating human faces in live video imagery. Next, optical flow near the peripheral logarithmic pole is used to construct robust navigational cues to sensor velocity. Most optical flow algorithms attempt to locate both the focus of expansion (FOE) and the axis of rotation (AOR) of the flow field in order to estimate sensor heading and velocity. This approach generally fails in practice because it is highly sensitive to noise, motion distractors, and sensor platform jitter. However, this thesis shows that the peripheral flow field allows a robust extraction of sensor velocity without detailed knowledge of the FOE or AOR, and in the presence of significant unknown motion distractors. This thesis presents the first algorithms that directly exploit the wide-angle complex dipole model of human visual anatomy in computer vision applications.
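The abstract names the monopole and dipole maps but gives no formulas. As a rough illustration only, the sketch below uses the forms common in the log-polar cortical mapping literature, w = log(z + a) for the monopole and w = log((z + a)/(z + b)) for the dipole, where z is a complex visual-field position in degrees. These functional forms, the function names, and the parameter values a = 0.5 and b = 80 are assumptions for illustration, not values taken from the thesis. The sketch covers a single hemifield (Re z ≥ 0).

```python
import numpy as np

def monopole_map(z, a=0.5):
    """Assumed monopole form: w = log(z + a), with z in degrees (complex)."""
    return np.log(np.asarray(z, dtype=complex) + a)

def dipole_map(z, a=0.5, b=80.0):
    """Assumed dipole form: w = log((z + a) / (z + b)).

    The second pole at z = -b compresses the far periphery more
    strongly than the monopole map does.
    """
    z = np.asarray(z, dtype=complex)
    return np.log((z + a) / (z + b))

def magnification(map_fn, z, h=1e-6, **params):
    """Numerical |dw/dz| by central difference along the real axis."""
    return np.abs((map_fn(z + h, **params) - map_fn(z - h, **params)) / (2 * h))

if __name__ == "__main__":
    # Compare how quickly resolution falls off with eccentricity.
    for ecc in (1.0, 10.0, 40.0):
        m = magnification(monopole_map, ecc)
        d = magnification(dipole_map, ecc)
        print(f"eccentricity {ecc:5.1f} deg: monopole {m:.4f}, dipole {d:.4f}")
```

Under these assumed forms the magnification is 1/|z + a| for the monopole and (b − a)/|(z + a)(z + b)| for the dipole, so the two nearly agree close to the fovea and diverge toward the periphery.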
