Conference: Innovations in Power and Advanced Computing Technologies

A deep learning based character recognition system from multimedia document


Abstract

Recognizing text in natural scene images is currently a very difficult task, more so than in videos. Pattern recognition, an application of image processing, makes it easier to recognize text in multimedia documents. A pattern can be a fingerprint image, a handwritten word sample, a human face image, a speech signal, a DNA sequence, and so on; in other words, all such patterns are in machine-editable form. Text can be recognized with or without character segmentation. Segmentation can be performed at the line, word, or character level; without segmentation, characters are recognized from the whole text image. Character recognition is an active research field, and a variety of work has been done in the area of pattern recognition. Here we apply a new technique, diagonal-based feature extraction, in the last layer of a convolutional neural network, and simplify feature extraction with the help of a genetic algorithm. After feature extraction, we train an extreme learning machine. Alongside this feature extraction technique, we use a feed-forward network as the classifier and a convolutional neural network as the feature extractor. This is a deep-learning-based neural network technique used for text classification or recognition, in both the training and testing phases. CRConvNet has multiple layers, and the operation of each layer is shown in a flowchart. One dataset contains 360 training samples covering uppercase (A-Z) and lowercase (a-z) letters, digits (0-9), and some special characters. Another dataset contains video and image samples (ICDAR 2003) for testing. Extensive studies show that a recognition system using diagonal-based feature learning provides high recognition accuracy while requiring less training time.
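
As a rough illustration of two components named in the abstract, the sketch below shows (a) a zone-wise diagonal feature, in a formulation commonly used for handwritten character recognition, and (b) a basic extreme learning machine whose hidden weights are random and fixed and whose output weights are solved by least squares. The abstract does not give the layer sizes, the genetic-algorithm step, or how the diagonal features are attached to the last CNN layer, so none of that is reproduced here. The names `diagonal_features` and `ELM`, the 10x10 zone size, the tanh activation, and the hidden-layer width are all assumptions, and the code operates on a raw 2-D array rather than on CNN feature maps.

```python
import numpy as np


def diagonal_features(img, zone=10):
    """One feature per zone x zone block: the mean of its per-diagonal means.

    Assumed formulation, not taken from the paper. Rows/columns that do not
    fill a whole zone are ignored.
    """
    h, w = img.shape
    feats = []
    for r in range(0, h - zone + 1, zone):
        for c in range(0, w - zone + 1, zone):
            block = img[r:r + zone, c:c + zone]
            # A zone x zone block has 2*zone - 1 diagonals.
            diag_means = [np.diagonal(block, offset=k).mean()
                          for k in range(-(zone - 1), zone)]
            feats.append(np.mean(diag_means))
    return np.asarray(feats)


class ELM:
    """Minimal extreme learning machine classifier (illustrative only)."""

    def __init__(self, n_hidden=64, seed=0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Random hidden layer; its weights are never trained.
        return np.tanh(X @ self.W + self.b)

    def fit(self, X, y):
        n_classes = int(y.max()) + 1
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        targets = np.eye(n_classes)[y]  # one-hot labels
        # Output weights: least-squares solution via the pseudoinverse.
        self.beta = np.linalg.pinv(self._hidden(X)) @ targets
        return self

    def predict(self, X):
        return np.argmax(self._hidden(X) @ self.beta, axis=1)
```

Under these assumptions, a 90x60 character image yields 54 features (one per 10x10 zone), and `ELM().fit(X, y)` expects `X` as a samples-by-features array and `y` as integer class labels starting at 0. Training reduces to a single pseudoinverse, which is the usual reason an extreme learning machine trains quickly.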
