Syllable-Based Indonesian Lip Reading Model

Abstract

Lip reading is a method of communication based on reading a speaker's lip movements. It is also called visual speech recognition, which converts video into text. The text consists of words or even sentences spoken by the speaker. One challenge often encountered in lip reading is the high variance of the inputs. Variations such as facial features and speaking speed can decrease accuracy. Deep learning now provides promising results in extracting visual features. To use video as the input, a 3D deep learning architecture is used. In addition, the out-of-vocabulary (OOV) problem makes visual speech recognition systems harder to apply in the real world: such a system can only predict words that appear in its dictionary. However, the vocabulary continues to grow each year, especially in the Indonesian language, so it is hard to fit all possible words into the system. Hence, a syllable-based model is proposed in this research to handle this problem. The syllable-based model makes it possible to build a new word that does not appear in the dictionary by combining existing syllables. Since the data obtained are too small for deep learning, the augmentation process is performed 40 times. Evaluated on the augmented data, the proposed model reaches 100% accuracy on the testing set. A test with ten OOV words shows that the developed model gives a lower accuracy of 80%.
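
The idea of building unseen words from known syllables can be illustrated with a small sketch. This is not the authors' implementation, which predicts syllables from video with a 3D deep learning model. It only shows why a syllable inventory covers more words than a word dictionary. The word list, the hand-made syllable splits, and the function names below are all hypothetical.

```python
from typing import Dict, List, Optional, Set

# Hypothetical training vocabulary, split into syllables by hand.
TRAIN_WORDS: Dict[str, List[str]] = {
    "saya": ["sa", "ya"],
    "buku": ["bu", "ku"],
    "mata": ["ma", "ta"],
    "kami": ["ka", "mi"],
}


def build_syllable_set(words: Dict[str, List[str]]) -> Set[str]:
    """Collect every syllable seen in the training vocabulary."""
    return {syl for syls in words.values() for syl in syls}


def compose(word: str, syllables: Set[str]) -> Optional[List[str]]:
    """Return one split of `word` into known syllables, or None if impossible."""
    # best[i] holds a segmentation of word[:i], or None if none was found.
    best: List[Optional[List[str]]] = [None] * (len(word) + 1)
    best[0] = []
    for end in range(1, len(word) + 1):
        for start in range(end):
            prefix = best[start]
            if prefix is not None and word[start:end] in syllables:
                best[end] = prefix + [word[start:end]]
                break
    return best[len(word)]


syllables = build_syllable_set(TRAIN_WORDS)
print(compose("sama", syllables))  # OOV word made of known syllables
print(compose("kamu", syllables))  # "mu" was never seen, so no split exists
```

In this toy setup, "sama" is absent from the word list but can still be assembled from "sa" and "ma", whereas "kamu" cannot because the syllable "mu" never occurs in the training words. A limit of this kind on unseen syllables is one plausible reason OOV accuracy could fall below test-set accuracy, though the abstract does not state the cause.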
