A Novel Audio-Oriented Learning Strategies for Character Recognition

Abstract

In this paper, we propose a robust audio-oriented learning strategy to address character recognition in movies and TV series. Identifying the major characters in movies and TV series has attracted considerable research interest. Most existing work explores character recognition and retrieval based on visual appearance, but visual appearance is inconsistent throughout a video. Our approach, which focuses mainly on audio, has four components: (i) we extract both spectral and temporal audio features, namely Mel-scale Frequency Cepstral Coefficients (MFCC), prosodic features, average pause length, speaking rate, pitch, and short-time energy, complemented by Gabor features; (ii) we adopt the Multi-Task Joint Sparse Representation and Recognition (MTJSRC) model for learning with all features except Gabor, and an SVM model for the Gabor features; (iii) using these original features as seeds, we extend the training set with data from talk shows through semi-supervised learning; (iv) we introduce a Conditional Random Field (CRF) model that accounts for constraints in the time sequence to refine the final labelling. Experimental results demonstrate the effectiveness of our approach.
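As a rough illustration of two of the steps named in the abstract, the sketch below computes short-time energy per frame (one of the listed temporal features) and smooths per-segment speaker scores along the time axis. It is not the authors' implementation: the function names, frame sizes, toy scores, and the hand-set `switch_penalty` are all assumptions, and the smoothing is plain max-sum (Viterbi) decoding on a chain, used here as a simplified stand-in for decoding a trained CRF.

```python
def short_time_energy(signal, frame_len, hop):
    """Mean squared amplitude of each frame of a 1-D signal (list of floats)."""
    energies = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        energies.append(sum(x * x for x in frame) / frame_len)
    return energies


def viterbi_smooth(scores, switch_penalty):
    """Choose one label per segment, maximising the summed classifier scores
    minus a fixed penalty for every label change between neighbours.

    scores[t][k] is the score of label k for segment t. The penalty is a
    hypothetical hand-set value, not a learned CRF transition weight.
    """
    n_labels = len(scores[0])
    best = list(scores[0])   # best path score ending in each label
    back = []                # backpointers for segments 1..T-1
    for t in range(1, len(scores)):
        new_best, pointers = [], []
        for k in range(n_labels):
            # Score of arriving at label k from each previous label j.
            cands = [best[j] - (switch_penalty if j != k else 0.0)
                     for j in range(n_labels)]
            j_star = max(range(n_labels), key=lambda j: cands[j])
            new_best.append(cands[j_star] + scores[t][k])
            pointers.append(j_star)
        best = new_best
        back.append(pointers)
    # Trace back from the best final label.
    label = max(range(n_labels), key=lambda k: best[k])
    path = [label]
    for pointers in reversed(back):
        label = pointers[label]
        path.append(label)
    path.reverse()
    return path


# Toy signal: one quiet frame followed by one loud frame.
signal = [0.0, 0.1, -0.1, 0.0, 1.0, -1.0, 1.0, -1.0]
energies = short_time_energy(signal, frame_len=4, hop=4)

# Toy scores for two speakers over four segments; segment 2 is ambiguous.
scores = [[2.0, 0.0], [2.0, 0.0], [0.9, 1.0], [2.0, 0.0]]
raw = [max(range(2), key=lambda k: s[k]) for s in scores]
smoothed = viterbi_smooth(scores, switch_penalty=1.0)
```

In this toy case the independent per-segment decision flips to speaker 1 on the ambiguous third segment, while the chain decoding keeps speaker 0 throughout because a one-segment switch costs more than the small score gain. A trained CRF would learn such transition costs from data rather than fixing them by hand.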

Bibliographic details

Source
International Conference on Virtual Reality and Visualization | 2016 | pp. 459-464 | 6 pages
Authors
Changbin Lu; Guangyu Gao
Original format: PDF
Keywords
Character recognition; Mel frequency cepstral coefficient; Feature extraction; Motion pictures; TV; Training; Support vector machines

Similar literature

1. Real-time Automated Detection and Recognition of Nigerian License Plates via Deep Learning Single Shot Detection and Optical Character Recognition [J]. Kayode David Adedayo, Ayomide Oluwaseyi Agunloye. Computer and Information Science. 2021, No. 4
2. Handwritten Urdu character recognition via images using different machine learning and deep learning techniques [J]. M Ameen Chhajro, Hadeeb Khan, Farrukh Khan. Indian Journal of Science and Technology. 2020, No. 17
3. Learning representation hierarchies by sharing visual features: a computational investigation of Persian character recognition with unsupervised deep learning [J]. Sadeghi Zahra, Testolin Alberto. Cognitive processing. 2017, No. 3
4. A Novel Audio-Oriented Learning Strategies for Character Recognition [C]. Changbin Lu, Guangyu Gao. International Conference on Virtual Reality and Visualization. 2016
5. Learning Chinese characters: A comparative study of the learning strategies of students whose native language is alphabet-based and students whose native language is character-based [D]. Arrow, Ju-Chuan. 2004
6. Character Recognition of Components Mounted on Printed Circuit Board Using Deep Learning [O]. Sumyung Gang, Ndayishimiye Fabrice, Daewon Chung. 2021
7. Relationship between the Recognition of Chinese Characters and Understanding Classical Chinese Literature, and Strategies for Learning Chinese Characters [O]. 藤本陽子. 2016
8. Learning Algorithms for Multi-Class Pattern Classification and Problems Associated with on-Line Handwritten Character Recognition [R]. Li, C. C., Teng, T. L. 1970
