Published in: Multimedia, IEEE Transactions on

A New Technique for Multi-Oriented Scene Text Line Detection and Tracking in Video

Abstract

Text detection and tracking in video is challenging because of variations in contrast, resolution, and background, as well as differing text orientations and movements. In addition, the presence of both caption and scene texts in video aggravates the problem because these two text types differ significantly in their characteristics. This paper proposes a new technique for detecting and tracking video texts of any orientation by using spatial and temporal information, respectively. The technique explores gradient directional symmetry at the component level for smoothing edge components before text detection. Spatial information is preserved by forming a Delaunay triangulation in a novel way at this level, which results in text candidates. Text characteristics are then proposed in a different way for eliminating false text candidates, which results in potential text candidates. Grouping is then proposed for combining potential text candidates regardless of orientation, based on the nearest neighbor criterion. To tackle the problems of multi-font and multi-sized texts, we propose multi-scale integration by a pyramid structure, which helps in extracting full text lines. The detected text lines are then tracked in video by matching the subgraphs of the triangulation. Experimental results for text detection and tracking on our video dataset, the benchmark video datasets, and the natural scene image benchmark datasets show that the proposed method is superior to the state-of-the-art methods in terms of recall, precision, and F-measure.
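
As a rough illustration of the nearest-neighbor grouping step mentioned in the abstract, the sketch below links each component centroid to its nearest neighbor by Euclidean distance, which does not depend on the orientation of the text line. It is not the authors' implementation: the function name `group_by_nearest_neighbor`, the `max_gap` threshold, and the use of centroids as component descriptors are assumptions made for this example, and the Delaunay triangulation, false-candidate elimination, and pyramid integration stages are omitted.

```python
import math


def group_by_nearest_neighbor(centroids, max_gap):
    """Group points by linking each one to its nearest neighbor.

    centroids: list of (x, y) tuples, one per candidate component (assumed input).
    max_gap:   hypothetical distance threshold; farther neighbors are not linked.
    Returns a sorted list of groups, each a list of point indices.
    """
    n = len(centroids)
    parent = list(range(n))  # union-find forest, one tree per group

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(i, j):
        ri, rj = find(i), find(j)
        if ri != rj:
            # Keep the smaller index as root so the result is deterministic.
            parent[max(ri, rj)] = min(ri, rj)

    for i in range(n):
        best_j, best_d = None, math.inf
        for j in range(n):
            if i == j:
                continue
            d = math.dist(centroids[i], centroids[j])
            if d < best_d:
                best_j, best_d = j, d
        # Link only when the nearest neighbor is close enough.
        if best_j is not None and best_d <= max_gap:
            union(i, best_j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# A horizontal line, a diagonal line, and one isolated component.
points = [(0, 0), (10, 0), (20, 0), (100, 100), (107, 107), (114, 114), (300, 0)]
lines = group_by_nearest_neighbor(points, max_gap=15)
```

Because the criterion uses only pairwise distance, the horizontal and diagonal components are grouped in the same way, while the isolated component remains a group of its own.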
