Journal: ACM Transactions on Accessible Computing

Sign Transition Modeling and a Scalable Solution to Continuous Sign Language Recognition for Real-World Applications


Abstract

We propose a new approach to modeling transition information between signs in continuous Sign Language Recognition (SLR) and address some scalability issues in designing SLR systems. In Automatic Speech Recognition (ASR), the transition between speech sounds is often brief and mainly accounted for by coarticulation effects. In continuous SLR, by contrast, the transition between signs is far less clear and usually cannot be characterized easily or exactly. Leveraging hidden Markov modeling techniques from ASR, we propose a modeling framework for continuous SLR with the following major advantages: (ⅰ) the system is easy to scale up to large-vocabulary SLR; (ⅱ) modeling of signs, as well as of the transitions between signs, is robust even for noisy data collected in real-world SLR; and (ⅲ) extensions to training, decoding, and adaptation are directly applicable, even with new deep learning algorithms. A pair of low-cost digital gloves, affordable for the deaf and hard of hearing community, is used to collect training and testing data for real-world SLR interaction applications. Evaluated on 1,024 testing sentences from five signers, the system achieves a word accuracy rate of 87.4% with a vocabulary of 510 words. Recognition runs in real time, requiring an average of 0.69 s per sentence. These encouraging results indicate that it is feasible to develop real-world SLR applications based on the proposed framework.
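The abstract reports a word accuracy rate but does not say how it is computed. The sketch below assumes the convention common in ASR evaluation, accuracy = (N - S - D - I) / N, where N is the number of reference words and S, D, and I are the substitutions, deletions, and insertions in a minimum-edit-distance alignment. The function name `word_accuracy` and the example glosses are hypothetical and are not taken from the paper.

```python
def word_accuracy(reference, hypothesis):
    """Word accuracy (N - S - D - I) / N from word-level edit distance.

    Assumption: this is the usual ASR-style definition; the paper may
    define its metric differently.
    """
    n, m = len(reference), len(hypothesis)
    if n == 0:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and hypothesis[:j].
    prev = list(range(m + 1))
    for i in range(1, n + 1):
        curr = [i] + [0] * m
        for j in range(1, m + 1):
            cost = 0 if reference[i - 1] == hypothesis[j - 1] else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr

    # prev[m] is the minimum total of S + D + I.
    return (n - prev[m]) / n


if __name__ == "__main__":
    ref = ["I", "WANT", "WATER", "NOW"]  # hypothetical sign glosses
    hyp = ["I", "WANT", "NOW"]           # one sign missed by the recognizer
    print(word_accuracy(ref, hyp))
```

Under this definition insertions are penalized too, so the score can drop below zero when the recognizer outputs many spurious signs.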
