International Journal of Pattern Recognition and Artificial Intelligence

A New Hybrid Method for Caption and Scene Text Classification in Action Video Images



Abstract

Achieving a high recognition rate for text in action video images is challenging because multiple types of text appear against unpredictable actions in the background. In this paper, we propose a new method for classifying caption text (text edited into the video) and scene text (text that is part of the scene) in video images. This work considers five action classes, namely Yoga, Concert, Teleshopping, Craft, and Recipes, where both types of text are expected to play a vital role in understanding the video content. The proposed method introduces a new fusion criterion based on Discrete Cosine Transform (DCT) and Fourier coefficients to obtain reconstructed images for caption and scene text. The fusion criterion computes the variances of the DCT and Fourier coefficients at corresponding pixels and uses those variances as the respective fusion weights; this step yields Reconstructed image-1. Inspired by the special property of Chebyshev-Harmonic-Fourier-Moments (CHFM), which can reconstruct a redundancy-free image, we use CHFM to obtain Reconstructed image-2. The reconstructed images, along with the input image, are passed to a Deep Convolutional Neural Network (DCNN) for caption/scene text classification. Experimental results on the five action classes and a comparative study with existing methods demonstrate that the proposed method is effective. In addition, recognition results obtained with different methods before and after classification show that recognition performance improves significantly once the text has been classified.
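The variance-weighted fusion step that produces Reconstructed image-1 can be sketched as follows. The abstract does not spell out implementation details, so this is a minimal sketch under assumptions: global variances of the DCT and Fourier-magnitude coefficient images are used as the two fusion weights, and the fused coefficients are mapped back to the spatial domain with an inverse DCT.

```python
import numpy as np
from scipy.fft import dctn, idctn, fft2

def fused_reconstruction(img: np.ndarray) -> np.ndarray:
    """Sketch of the DCT/Fourier variance-weighted fusion (assumed details)."""
    # Coefficient images: 2-D DCT and Fourier magnitude spectrum
    d = dctn(img, norm="ortho")
    f = np.abs(fft2(img))
    # Variances of the coefficient images serve as fusion weights
    wd, wf = d.var(), f.var()
    fused = (wd * d + wf * f) / (wd + wf)
    # Inverse DCT of the fused coefficients gives Reconstructed image-1
    return idctn(fused, norm="ortho")

# Toy grayscale input standing in for a video frame
img = np.random.rand(64, 64)
rec1 = fused_reconstruction(img)
print(rec1.shape)
```

In the paper, Reconstructed image-1 is then stacked with the CHFM-based Reconstructed image-2 and the original frame before being fed to the DCNN classifier; the per-pixel versus global treatment of the variances is one of the details the full text would pin down.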
