Journal: Multimedia Tools and Applications

Text recognition in scene image and video frame using Color Channel selection


Abstract

In recent years, recognizing text in natural scene images and video frames has attracted growing attention from researchers because of the complexities and challenges involved. Low resolution, blur, complex backgrounds, and variation in font, color, and text alignment within images and video frames make recognition in such scenarios difficult. Most current approaches first apply a binarization algorithm to convert the input into a binary image and then apply OCR to obtain the recognition result. In this paper, we present a novel approach based on color channel selection for text recognition in scene images and video frames. The approach first selects a color channel automatically and then uses the selected channel for text recognition. Our text recognition framework is based on a Hidden Markov Model (HMM) that uses Pyramidal Histogram of Oriented Gradient features extracted from the selected color channel. For each sliding window, our color channel selection approach analyzes the image properties within the window, and a multi-label Support Vector Machine (SVM) classifier then selects the color channel that will provide the best recognition results for that window. Selecting a color channel per sliding window has been found to be more effective than using a single color channel for the whole word image. Five different features have been analyzed for multi-label SVM based color channel selection, of which the wavelet transform based feature outperforms the others. Our color channel selection framework is script-independent and has been tested on English (Roman) and Devanagari (Indic) scripts. We tested our approach on publicly available English datasets (ICDAR 2003, ICDAR 2013, MSRA-TD500, IIIT5K, SVT, YVT) covering both video and scene images. For Devanagari script, we collected our own dataset. The experimental results are encouraging and show the advantage of the proposed method.
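
The per-window channel selection described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract specifies a multi-label SVM trained on wavelet transform based features, whereas the sketch below swaps in a simple variance heuristic as a stand-in selector so that it runs with the Python standard library alone. All names (`sliding_windows`, `variance_selector`, `select_channels`), the RGB channel set, and the window and step sizes are assumptions made for illustration.

```python
from statistics import pvariance
from typing import Callable, List, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B), each in 0..255
Image = List[List[Pixel]]      # word image stored as rows of pixels
CHANNELS = ("R", "G", "B")     # assumed channel set; the abstract does not list one


def sliding_windows(width: int, win: int, step: int) -> List[Tuple[int, int]]:
    """Return (start, end) column ranges covering the image from left to right."""
    if width <= win:
        return [(0, width)]
    starts = list(range(0, width - win + 1, step))
    if starts[-1] + win < width:   # add a final window so the right edge is covered
        starts.append(width - win)
    return [(s, s + win) for s in starts]


def channel_values(image: Image, cols: Tuple[int, int], ch: int) -> List[int]:
    """Flatten one color channel of the window spanning the given columns."""
    lo, hi = cols
    return [row[x][ch] for row in image for x in range(lo, hi)]


def variance_selector(image: Image, cols: Tuple[int, int]) -> int:
    """Stand-in for the trained classifier: index of the highest-variance channel."""
    scores = [pvariance(channel_values(image, cols, ch)) for ch in range(3)]
    return scores.index(max(scores))


def select_channels(
    image: Image,
    win: int = 4,
    step: int = 2,
    selector: Callable[[Image, Tuple[int, int]], int] = variance_selector,
) -> List[Tuple[Tuple[int, int], str]]:
    """Choose one color channel for each sliding window of a word image."""
    width = len(image[0])
    return [
        (cols, CHANNELS[selector(image, cols)])
        for cols in sliding_windows(width, win, step)
    ]


# Toy image: the left half varies only in red, the right half only in blue.
image = [
    [
        (255 if x % 2 else 0, 100, 100) if x < 4 else (100, 100, 255 if x % 2 else 0)
        for x in range(8)
    ]
    for _ in range(2)
]
print([name for _, name in select_channels(image, win=4, step=4)])  # -> ['R', 'B']
```

The `selector` argument is the point where a trained classifier would be plugged in. In a pipeline like the one the abstract describes, the features passed to the recognizer for each window would then be extracted from whichever channel was selected for that window.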
