IEEE Transactions on Circuits and Systems for Video Technology

Multi-Script-Oriented Text Detection and Recognition in Video/Scene/Born Digital Images

Abstract

Achieving good text detection and recognition results for multi-script-oriented images is a challenging task. First, we explore bit plane slicing to exploit the most significant bit information for identifying text components. We then propose a new iterative nearest neighbor symmetry, based on the shapes of the convex and concave deficiencies of text components in the bit planes, to identify candidate planes. Further, we introduce a new concept, mutual nearest neighbor pair components, based on gradient direction, to identify representative text pairs in each candidate bit plane. The representative pairs are used to restore words with the help of the edge image of the input image, which yields the text detection results (words). Second, we propose fixing a window for the character components of arbitrarily oriented words based on the angular relationship between sub-bands and a fused band. For each window, we extract features in the contourlet wavelet domain to detect characters with the help of an SVM classifier. Further, we explore an HMM for recognizing characters and words of any orientation using the same feature vector. The proposed method is evaluated on standard databases (ICDAR and YVT video data; ICDAR, SVT, and MSRA scene data; ICDAR born digital data; and multi-lingual data) to show its superiority to state-of-the-art methods.
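
To make two of the building blocks named in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It shows generic bit plane slicing of an 8-bit grayscale image and a generic mutual nearest neighbor pairing. The function names `bit_planes` and `mutual_nearest_pairs` are hypothetical, and the pairing here assumes plain Euclidean distance between component centroids, whereas the abstract describes a pairing based on gradient direction whose details are not given.

```python
from typing import List, Sequence, Tuple

# Assumption: an 8-bit grayscale image stored as rows of ints in 0..255.
Image = List[List[int]]


def bit_planes(gray: Image, bits: int = 8) -> List[Image]:
    """Return binary planes ordered from most to least significant bit."""
    planes = []
    for k in range(bits - 1, -1, -1):
        # Shift bit k down to position 0 and mask it out for every pixel.
        planes.append([[(px >> k) & 1 for px in row] for row in gray])
    return planes


def mutual_nearest_pairs(
    points: Sequence[Tuple[float, float]],
) -> List[Tuple[int, int]]:
    """Return index pairs (i, j), i < j, that are each other's nearest neighbor.

    Assumption: Euclidean distance between centroids stands in for the
    gradient-direction criterion mentioned in the abstract.
    """

    def nearest(i: int) -> int:
        best, best_d = -1, float("inf")
        for j, q in enumerate(points):
            if j == i:
                continue
            d = (points[i][0] - q[0]) ** 2 + (points[i][1] - q[1]) ** 2
            if d < best_d:
                best, best_d = j, d
        return best

    nn = [nearest(i) for i in range(len(points))]
    # Keep a pair only when the nearest-neighbor relation holds both ways.
    return [(i, j) for i, j in enumerate(nn) if j > i and nn[j] == i]


if __name__ == "__main__":
    gray = [[200, 13], [128, 255]]
    planes = bit_planes(gray)
    print(planes[0])   # most significant bit plane: [[1, 0], [1, 1]]
    print(planes[-1])  # least significant bit plane: [[0, 1], [0, 1]]

    centroids = [(0, 0), (1, 0), (10, 0), (11, 0), (5, 5)]
    print(mutual_nearest_pairs(centroids))  # [(0, 1), (2, 3)]
```

In the example, the centroid at (5, 5) has a nearest neighbor but is not that neighbor's nearest in return, so it stays unpaired; only mutually closest components form pairs.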