Spatiotemporal gesture segmentation.

Abstract

Spotting patterns of interest in an input signal is a useful task in many fields, including medicine, bioinformatics, economics, speech recognition and computer vision. Example instances of this problem include spotting an object of interest in an image (e.g., a tumor), a pattern of interest in a time-varying signal (e.g., audio analysis), or an object of interest moving in a specific way (e.g., a human's body gesture). Traditional spotting methods, which are based on Dynamic Time Warping or hidden Markov models, use some variant of dynamic programming to register the pattern and the input while accounting for temporal variation between them. Those methods often suffer from three shortcomings: they may give meaningless solutions when input observations are unreliable or ambiguous, they require a high-complexity search across the whole input signal, and they may give incorrect solutions if some patterns appear as smaller parts within other patterns. In this thesis, we develop a framework that addresses these three problems, and evaluate the framework's performance in spotting and recognizing hand gestures in video.

The first contribution is a spatiotemporal matching algorithm that extends the dynamic programming formulation to accommodate multiple candidate hand detections in every video frame. The algorithm finds the best alignment between the gesture model and the input, and simultaneously locates the best candidate hand detection in every frame. This allows a gesture to be recognized even when the hand location is highly ambiguous.

The second contribution is a pruning method that uses model-specific classifiers to reject dynamic programming hypotheses with a poor match between the input and model. Pruning improves the efficiency of the spatiotemporal matching algorithm, and in some cases may improve the recognition accuracy. The pruning classifiers are learned from training data, and cross-validation is used to reduce the chance of overpruning.

The third contribution is a subgesture reasoning process that models the fact that some gesture models can falsely match parts of other, longer gestures. By integrating subgesture reasoning, the spotting algorithm can avoid the premature detection of a subgesture when the longer gesture is actually being performed. Subgesture relations between pairs of gestures are automatically learned from training data.

The performance of the approach is evaluated on two challenging video datasets: hand-signed digits gestured by users wearing short-sleeved shirts in front of a cluttered background, and American Sign Language (ASL) utterances gestured by native ASL signers. The experiments demonstrate that the proposed method is more accurate and efficient than competing approaches. The proposed approach can be applied generally to alignment or search problems with multiple input observations that use dynamic programming to find a solution.
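
The idea behind the first contribution, dynamic programming over both time and candidate detections, can be illustrated with a minimal sketch. This is not the thesis's implementation. It assumes, purely for illustration, that a gesture model is a sequence of 2D points, that each frame provides a list of candidate 2D hand positions, and that the local cost is Euclidean distance. The function name `spatiotemporal_dtw` and the transition rules are hypothetical choices, not taken from the source.

```python
import math


def spatiotemporal_dtw(model, frames):
    """Align a model to frames that each hold several candidate detections.

    model  -- list of (x, y) model states (assumed representation)
    frames -- list of frames; each frame is a list of candidate (x, y) points
    Returns (total_cost, chosen), where chosen[j] is the index of the
    candidate selected in frame j.
    """
    m, n = len(model), len(frames)
    # cost[i][j][k]: best cost of matching model[0..i] to frames[0..j]
    # when frame j is explained by its k-th candidate.
    cost = [[[math.inf] * len(frames[j]) for j in range(n)] for _ in range(m)]
    back = [[[None] * len(frames[j]) for j in range(n)] for _ in range(m)]

    for i in range(m):
        for j in range(n):
            for k, cand in enumerate(frames[j]):
                local = math.dist(model[i], cand)
                if i == 0 and j == 0:
                    cost[i][j][k] = local
                    continue
                preds = []
                if i > 0:
                    # Same frame matched to the next model state:
                    # the chosen candidate must stay the same.
                    preds.append((i - 1, j, k))
                if j > 0:
                    # Coming from the previous frame: any of its candidates.
                    for kp in range(len(frames[j - 1])):
                        preds.append((i, j - 1, kp))
                        if i > 0:
                            preds.append((i - 1, j - 1, kp))
                best = min(preds, key=lambda p: cost[p[0]][p[1]][p[2]])
                cost[i][j][k] = local + cost[best[0]][best[1]][best[2]]
                back[i][j][k] = best

    # Pick the best candidate in the last frame, then trace back.
    k_end = min(range(len(frames[-1])), key=lambda k: cost[m - 1][n - 1][k])
    total = cost[m - 1][n - 1][k_end]
    chosen = [None] * n
    node = (m - 1, n - 1, k_end)
    while node is not None:
        i, j, k = node
        chosen[j] = k
        node = back[i][j][k]
    return total, chosen


if __name__ == "__main__":
    # Toy data: each frame has one plausible detection and one distractor.
    model = [(0, 0), (1, 1), (2, 2)]
    frames = [[(5, 5), (0, 0)], [(1, 1), (9, 9)], [(7, 0), (2, 2)]]
    print(spatiotemporal_dtw(model, frames))
```

Compared with ordinary Dynamic Time Warping, the table gains a third index for the candidate, so the alignment and the per-frame detection choice are optimized jointly. A pruning step of the kind described in the second contribution would discard table cells before expanding them. It is omitted here because the abstract does not specify the classifiers.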

Bibliographic record

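  • Author

    Alon, Jonathan

  • Author affiliation

    Boston University

  • Degree-granting institution: Boston University
  • Subject: Computer Science
  • Degree: Ph.D.
  • Year: 2006
  • Pages: 144 p.
  • Total pages: 144
  • Original format: PDF
  • Language of text: eng
  • Chinese Library Classification: Automation technology, computer technology
  • Keywords

  • Date added to database: 2022-08-17 11:40:37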
