International Conference on Multimedia Modeling
A Deep Convolutional Deblurring and Detection Neural Network for Localizing Text in Videos

Abstract

Scene text in video is usually vulnerable to various kinds of blur, such as that caused by camera or text motion, which makes it harder to reliably extract the text for content-based video applications. In this paper, we propose a novel fully convolutional deep neural network for deblurring and detecting text in video. Specifically, to cope with the blur of video text, we propose an effective deblurring subnetwork composed of multi-level convolutional blocks with both cross-block (long) and within-block (short) skip connections for progressively learning residual deblurred image details, together with a spatial attention mechanism that pays more attention to blurred regions; it generates a sharper image for the current frame by fusing multiple surrounding adjacent frames. To further localize text in the frames, we enhance the EAST text detection model by introducing deformable convolution layers and deconvolution layers, which better capture the widely varied appearances of video text. Experiments on a public scene text video dataset demonstrate the state-of-the-art performance of the proposed video text deblurring and detection model.
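The abstract does not specify the deblurring subnetwork's exact formulation, but the core idea of fusing adjacent frames with per-pixel spatial attention can be illustrated with a minimal NumPy sketch. Here the attention weights come from a softmax over a simple Laplacian sharpness proxy, so sharper frames contribute more at each pixel; the function names (`laplacian_sharpness`, `fuse_frames`) and the choice of sharpness measure are assumptions for illustration, not the paper's method:

```python
import numpy as np

def laplacian_sharpness(frame):
    # 4-neighbour Laplacian magnitude as a crude per-pixel sharpness proxy
    # (an assumption; the paper learns its attention, it does not hand-craft it)
    lap = (np.roll(frame, 1, 0) + np.roll(frame, -1, 0)
           + np.roll(frame, 1, 1) + np.roll(frame, -1, 1) - 4.0 * frame)
    return np.abs(lap)

def fuse_frames(frames, temperature=1.0):
    """Fuse a list of (H, W) adjacent frames into one sharper frame using
    a per-pixel softmax attention over each frame's sharpness score."""
    stack = np.stack(frames)                                  # (T, H, W)
    scores = np.stack([laplacian_sharpness(f) for f in frames]) / temperature
    scores -= scores.max(axis=0, keepdims=True)               # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=0, keepdims=True)             # softmax over frames
    return (weights * stack).sum(axis=0)                      # (H, W) convex blend
```

Because the weights form a convex combination at every pixel, the fused frame always lies pointwise between the input frames while leaning toward the locally sharper one.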
