A Novel Audio-Oriented Learning Strategies for Character Recognition

Abstract

In this paper, we propose robust audio-oriented learning strategies to address character recognition in movies and TV series. Identifying the major characters in movies and TV series has drawn great interest from researchers. Most existing work explores character recognition and retrieval based on visual appearance, but visual appearance is inconsistent throughout a video. Our approach, which focuses mainly on audio, has four parts: (i) we extract both spectral and temporal audio features, namely Mel-scale Frequency Cepstral Coefficients (MFCC), prosodic features, average pause length, speaking rate, pitch, and short-time energy, together with complementary Gabor features; (ii) we adopt the Multi-Task Joint Sparse Representation and Recognition (MTJSRC) model for learning with all the features except Gabor, and an SVM model for the Gabor features; (iii) treating these original features as seeds, we extend the training set with data from talk shows using semi-supervised learning; (iv) a Conditional Random Field (CRF) model that accounts for constraints in the time sequence is introduced to refine the final labelling. Finally, experimental results demonstrate the effectiveness of our approach.
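
As a rough illustration of two of the temporal features named in the abstract (short-time energy and average pause length), the sketch below computes both on a synthetic signal. It is not the authors' implementation: the function names, the 400-sample frame, the 160-sample hop, the 16 kHz sample rate, and the energy threshold are all assumptions chosen for the example.

```python
import math

def short_time_energy(samples, frame_len=400, hop=160):
    """Mean squared amplitude per frame (frame_len and hop are assumed values)."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame) / frame_len)
    return energies

def average_pause_length(energies, threshold, hop=160, sample_rate=16000):
    """Mean duration, in seconds, of runs of consecutive low-energy frames."""
    pauses, run = [], 0
    for e in energies:
        if e < threshold:
            run += 1          # still inside a pause
        else:
            if run:
                pauses.append(run)
            run = 0
    if run:                   # signal ended during a pause
        pauses.append(run)
    if not pauses:
        return 0.0
    return (sum(pauses) / len(pauses)) * hop / sample_rate

# Synthetic input (hypothetical): 0.5 s tone, 0.25 s silence, 0.5 s tone.
sr = 16000
tone = [0.5 * math.sin(2 * math.pi * 220 * n / sr) for n in range(sr // 2)]
signal = tone + [0.0] * (sr // 4) + tone

energies = short_time_energy(signal)
pause = average_pause_length(energies, threshold=1e-3)
print(len(energies), round(pause, 3))
```

The measured pause comes out slightly shorter than the 0.25 s of silence, because frames that straddle the tone and the silence still carry enough energy to count as speech. In practice the threshold would need tuning per recording.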

Bibliographic details

Source
International Conference on Virtual Reality and Visualization | 2016 | 1 v. | 6 pages
Authors
Changbin Lu; Guangyu Gao
Original format: PDF
CLC classification: Computer simulation
Keywords
Character recognition; Mel frequency cepstral coefficient; Feature extraction; Motion pictures; TV; Training; Support vector machines;

Similar literature

1. Real-time Automated Detection and Recognition of Nigerian License Plates via Deep Learning Single Shot Detection and Optical Character Recognition [J]. Kayode David Adedayo, Ayomide Oluwaseyi Agunloye. Computer and Information Science. 2021, No. 4
2. Handwritten Urdu character recognition via images using different machine learning and deep learning techniques [J]. M Ameen Chhajro, Hadeeb Khan, Farrukh Khan. Indian Journal of Science and Technology. 2020, No. 17
3. Learning representation hierarchies by sharing visual features: a computational investigation of Persian character recognition with unsupervised deep learning [J]. Sadeghi Zahra, Testolin Alberto. Cognitive processing. 2017, No. 3
4. A Novel Audio-Oriented Learning Strategies for Character Recognition [C]. Changbin Lu, Guangyu Gao. International Conference on Virtual Reality and Visualization. 2016
5. Learning Chinese characters: A comparative study of the learning strategies of students whose native language is alphabet-based and students whose native language is character-based [D]. Arrow, Ju-Chuan. 2004
6. Character Recognition of Components Mounted on Printed Circuit Board Using Deep Learning [O]. Sumyung Gang, Ndayishimiye Fabrice, Daewon Chung. 2021
7. Relationship between the Recognition of Chinese Characters and Understanding Classical Chinese Literature, and Strategies for Learning Chinese Characters [O]. 藤本陽子. 2016
8. Learning Algorithms for Multi-Class Pattern Classification and Problems Associated with on-Line Handwritten Character Recognition [R]. Li, C. C., Teng, T. L. 1970
