Multimedia Tools and Applications

Scene text detection using enhanced Extremal region and convolutional neural network



Abstract

Text in scene images usually carries significant information. Detecting and recognizing text in scenes is important for a variety of advanced machine vision applications, such as image and video retrieval, automotive assistance, and multilingual translation. In particular, most text recognition systems require text to be localized in the image beforehand, and this localization is itself a demanding task. The purpose of this study is to provide a method for detecting text in natural images. The proposed approach combines the advantages of extremal region (ER) methods with the classification power of a convolutional neural network (CNN), which significantly reduces false positives and increases detection accuracy. Sliding windows of different sizes are employed to determine text candidates. Enhanced ERs are extracted in three consecutive stages on the three color channels R, G, and B, and the results are then combined by an add operation. After grouping, the word candidates are classified into text and non-text classes by a CNN classifier. By applying the non-maximum suppression (NMS) algorithm to overlapping detections of the same word, the word with the highest probability is selected. On the ICDAR2013 database, the proposed text detection model achieves average accuracy, recall, precision, and F-measure of 0.893, 0.962, 0.948, and 0.955, respectively. The optimal cut point of the proposed method is 0.648, which yields the highest average accuracy, 91.93%. The AUCs of the ROC and PR curves for the proposed model are 0.851 and 0.718, respectively, an outstanding improvement over the best detection rates of previous methods. Experimental results on the ICDAR2011, ICDAR2013, and ICDAR2015 databases also demonstrate that our algorithm outperforms state-of-the-art scene text detection methods.
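Two details of the pipeline above can be illustrated concretely: the reported F-measure follows from the stated precision and recall via F = 2PR/(P+R), and the final step is a greedy NMS over scored word boxes. The sketch below is a minimal, hypothetical Python illustration; the `(x1, y1, x2, y2)` box format, the IoU threshold, and the greedy variant are assumptions, since the abstract does not specify the exact NMS implementation used:

```python
def f_measure(precision, recall):
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)

def iou(a, b):
    # Intersection-over-union of two boxes given as (x1, y1, x2, y2).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def nms(boxes, scores, iou_thresh=0.5):
    # Greedy NMS: repeatedly keep the highest-scoring box and drop
    # any remaining box that overlaps it by more than iou_thresh.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        i = order.pop(0)
        keep.append(i)
        order = [j for j in order if iou(boxes[i], boxes[j]) < iou_thresh]
    return keep
```

As a consistency check, `f_measure(0.948, 0.962)` rounds to 0.955, matching the F-measure reported for ICDAR2013; and given two heavily overlapping detections of the same word plus one distant box, `nms` keeps only the higher-scoring overlap and the distant box.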


