2015 IEEE China Summit & International Conference on Signal and Information Processing
A new unsupervised convolutional neural network model for Chinese scene text detection

Abstract

As one of the most popular deep learning models, the convolutional neural network (CNN) has achieved great success in image information extraction. Traditionally, a CNN is trained by supervised learning on labeled data and used as a classifier by appending a classification layer. Its capability to extract image features is limited largely by the difficulty of assembling a large training dataset. In this paper, we propose a new unsupervised learning CNN model that uses a convolutional sparse auto-encoder (CSAE) algorithm to pre-train the CNN. Instead of labeled natural images, the CSAE algorithm trains the CNN on unlabeled artificial images, which makes the training data easy to expand and enables unsupervised learning. The CSAE algorithm is specifically designed to extract complex features from particular objects such as Chinese characters. After the CSAE algorithm extracts features from the artificial images, the learned parameters are used to initialize the first convolutional layer of the CNN, and the CNN model is then fine-tuned on scene image patches with a linear classifier. The new CNN model is applied to Chinese scene text detection and evaluated on a multilingual image dataset in which Chinese, English, and numeral text are labeled separately. A detection precision gain of more than 10% is observed compared with two CNN models.
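
The pre-training idea in the abstract (learn features from unlabeled artificial images, then use them to initialize the first convolutional layer) can be illustrated with a rough NumPy sketch. Everything below is an assumption made for illustration: the abstract does not give the CSAE formulation, so a generic sparse auto-encoder with an L1 activation penalty stands in for it, the stroke-like synthetic images stand in for the artificial character images, and all function names and hyperparameters are hypothetical. The fine-tuning step with a linear classifier is omitted.

```python
import numpy as np

def make_artificial_images(n, size, rng):
    """Stand-in for artificial character images: a few random strokes each."""
    imgs = np.zeros((n, size, size))
    for img in imgs:  # each img is a view, so writes modify imgs in place
        for _ in range(4):
            pos = rng.integers(0, size)
            if rng.random() < 0.5:
                img[pos, :] = 1.0   # horizontal stroke
            else:
                img[:, pos] = 1.0   # vertical stroke
    return imgs

def extract_patches(images, k, n, rng):
    """Sample n random k x k patches and flatten each into a row."""
    h, w = images.shape[1:]
    rows = []
    for _ in range(n):
        i = rng.integers(0, len(images))
        y = rng.integers(0, h - k + 1)
        x = rng.integers(0, w - k + 1)
        rows.append(images[i, y:y + k, x:x + k].ravel())
    return np.array(rows)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_sparse_autoencoder(X, n_hidden, l1=1e-3, lr=0.05, epochs=300, seed=0):
    """Full-batch gradient descent on reconstruction error + L1 on activations."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0.0, 0.1, (d, n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.1, (n_hidden, d))
    b2 = np.zeros(d)
    losses = []
    for _ in range(epochs):
        H = sigmoid(X @ W1 + b1)          # hidden activations (all positive)
        R = H @ W2 + b2                   # linear reconstruction
        err = R - X
        losses.append(0.5 * np.mean(np.sum(err ** 2, axis=1))
                      + l1 * np.mean(np.sum(H, axis=1)))
        dR = err / n
        dH = dR @ W2.T + l1 / n           # L1 gradient is constant since H > 0
        dZ = dH * H * (1.0 - H)
        W2 -= lr * (H.T @ dR)
        b2 -= lr * dR.sum(axis=0)
        W1 -= lr * (X.T @ dZ)
        b1 -= lr * dZ.sum(axis=0)
    return W1, b1, losses

def conv_layer(image, filters, bias):
    """Valid cross-correlation of one image with a bank of k x k filters."""
    k = filters.shape[-1]
    windows = np.lib.stride_tricks.sliding_window_view(image, (k, k))
    maps = np.einsum("ijkl,fkl->fij", windows, filters)
    return sigmoid(maps + bias[:, None, None])

rng = np.random.default_rng(0)
images = make_artificial_images(50, 16, rng)
X = extract_patches(images, 5, 500, rng)
W1, b1, losses = train_sparse_autoencoder(X, n_hidden=8)

# Reuse the learned encoder weights as the first convolutional layer's filters.
filters = W1.T.reshape(8, 5, 5)
feature_maps = conv_layer(images[0], filters, b1)
print(filters.shape, feature_maps.shape)
```

Each column of the encoder matrix is reshaped into one 5 x 5 filter, so sliding the filter bank over an image applies the patch encoder at every position. In a full pipeline these filters would only be a starting point, to be refined on labeled scene image patches.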
