Use of Projector-Camera System for Human-Computer Interaction.

Abstract

The use of a projector in place of a traditional display device dissociates display size from device size, making portability much less of an issue. Paired with a camera, the projector-camera system allows simultaneous video display and 3D acquisition through imperceptible structured light sensing, providing a vivid and immersive platform for natural human-computer interaction. The key issues in this approach include: (1) Simultaneous display and acquisition: how to make a normal video projector serve not only as a display device but also as a 3D sensor, under the prerequisite of incurring minimal disturbance to the original projection; (2) 3D information interpretation: how to interpret the sparse depth information with the assistance of additional cues to enhance system performance; (3) Segmentation: how to acquire accurate segmentation in the presence of incessant variation of the projected video content; (4) Posture recognition: how to infer 3D posture from a single image. This thesis aims at providing improved solutions to each of these issues.

To address the conflict between the imperceptibility of the embedded codes and the robustness of code retrieval, noise-tolerant schemes for both the coding and decoding stages are introduced. At the coding end, specifically designed primitive shapes and a large Hamming distance are employed to enhance tolerance toward noise. At the decoding end, pre-trained primitive shape detectors are used to detect and identify the embedded codes -- a task difficult to achieve by the segmentation used in general structured light methods, since the weakly embedded information is generally corrupted by substantial noise.

On 3D information interpretation, a system that estimates 6-DOF head pose by imperceptible structured light sensing is proposed.
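As a concrete illustration of the coding principle above (a minimal sketch, not the thesis's actual primitive-shape codebook): codewords chosen with a large minimum pairwise Hamming distance d let a decoder correct up to floor((d-1)/2) bit errors by nearest-codeword matching. All names here are our own for illustration.

```python
from itertools import product

def hamming(a, b):
    """Number of positions in which two codewords differ."""
    return sum(x != y for x, y in zip(a, b))

def build_codebook(length, min_dist):
    """Greedily pick binary codewords whose pairwise Hamming
    distance is at least `min_dist`."""
    codebook = []
    for cand in product((0, 1), repeat=length):
        if all(hamming(cand, c) >= min_dist for c in codebook):
            codebook.append(cand)
    return codebook

def decode(word, codebook):
    """Map a possibly noise-corrupted word to the nearest codeword."""
    return min(codebook, key=lambda c: hamming(word, c))

codebook = build_codebook(length=7, min_dist=3)
# Minimum distance 3 corrects any single-bit error.
noisy = list(codebook[1])
noisy[4] ^= 1                      # flip one bit to simulate noise
assert tuple(decode(noisy, codebook)) == codebook[1]
```

The same distance-based robustness argument carries over when the "bits" are the detection scores of the thesis's primitive shapes rather than raw pixel values.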
First, through an elaborate pattern projection strategy and camera-projector synchronization, pattern-illuminated images and the corresponding scene-texture image are captured under imperceptible patterned illumination. Then, the 3D positions of the key facial feature points are derived by combining the 2D facial feature points in the scene-texture image, localized by an Active Appearance Model (AAM), with the point cloud generated by structured light sensing. Eventually, the head orientation and translation are estimated by SVD of a correlation matrix generated from the corresponding 3D feature point pairs across frames.

On the segmentation issue, we describe a coarse-to-fine hand segmentation method for projector-camera systems. After rough segmentation by contrast saliency detection and mean shift-based discontinuity-preserving smoothing, the refined result is confirmed through confidence evaluation.

Finally, we address how an HCI (Human-Computer Interface) with small device size, a large display, and touch input can be realized with a mere projector and camera. The realization is through a properly embedded structured light sensing scheme that enables a regular light-colored table surface to serve the dual roles of projection screen and touch-sensitive display surface. A random binary pattern is employed to code structured light at pixel accuracy, embedded into the regular projection display in such a way that the user perceives only the regular display and not the structured pattern hidden within it. With the projection display on the table surface imaged by a camera, the observed image data, together with the known projection content, can probe the 3D world immediately above the table surface: deciding whether a finger is present, whether it touches the table surface, and if so at what position on the surface the fingertip makes contact.
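The SVD-based orientation and translation step described above is, in essence, the standard rigid-alignment procedure on corresponding 3D point pairs. The sketch below is that generic procedure, not code from the thesis; the point data and the name `estimate_pose` are illustrative only.

```python
import numpy as np

def estimate_pose(P, Q):
    """Rigid rotation R and translation t with Q ~= R @ P + t,
    from paired 3D feature points P, Q of shape (3, N)."""
    p_mean = P.mean(axis=1, keepdims=True)
    q_mean = Q.mean(axis=1, keepdims=True)
    H = (P - p_mean) @ (Q - q_mean).T                    # 3x3 correlation matrix
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection in the least-squares solution.
    D = np.diag([1.0, 1.0, np.linalg.det(Vt.T @ U.T)])
    R = Vt.T @ D @ U.T
    t = q_mean - R @ p_mean
    return R, t

# Synthetic check: rotate and translate points, then recover the motion.
rng = np.random.default_rng(0)
P = rng.standard_normal((3, 6))
a = 0.3
R_true = np.array([[np.cos(a), -np.sin(a), 0.0],
                   [np.sin(a),  np.cos(a), 0.0],
                   [0.0,        0.0,       1.0]])
t_true = np.array([[0.5], [-1.0], [2.0]])
R, t = estimate_pose(P, R_true @ P + t_true)
assert np.allclose(R, R_true) and np.allclose(t, t_true)
```

In the head-pose setting, P and Q would be the 3D facial feature points recovered in two different frames, and (R, t) the inter-frame head motion.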
All these decisions hinge upon a careful calibration of the projector-camera-table surface system, intelligent segmentation of the hand in the image data, and exploitation of the homography mapping between the projector's display panel and the camera's image plane.
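The homography exploited in the last step maps points on the table plane between the projector's display panel and the camera's image plane. A minimal sketch of applying such a mapping follows; the matrix values are purely illustrative, since a real system would estimate H during calibration from at least four point correspondences (e.g. by DLT).

```python
def apply_homography(H, x, y):
    """Map a point (x, y) through a 3x3 homography H
    (row-major nested lists), returning Cartesian coordinates."""
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return u / w, v / w            # divide out the projective scale

# Illustrative homography: scale by 2 and shift by (10, 5).
H = [[2.0, 0.0, 10.0],
     [0.0, 2.0,  5.0],
     [0.0, 0.0,  1.0]]
assert apply_homography(H, 3.0, 4.0) == (16.0, 13.0)
```

Once H is calibrated, a fingertip detected in camera coordinates can be mapped straight back to projector-panel coordinates, which is what turns the table surface into a touch-sensitive display.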

Bibliographic details

  • Author: Dai, Jingwen
  • Affiliation: The Chinese University of Hong Kong (Hong Kong)
  • Degree grantor: The Chinese University of Hong Kong (Hong Kong)
  • Subject: Computer Science
  • Degree: Ph.D.
  • Year: 2012
  • Pages: 201 p.
  • Total pages: 201
  • Format: PDF
  • Language: eng
