Conference: International Conference on Advances in Computing, Communications and Informatics

A comparative study of two recent word spotting techniques in the run-length compressed domain

Abstract

This paper presents a comparative study of two recent word spotting techniques ([1] and [2]) that operate directly in the run-length compressed domain. The first technique relies on partial decompression and limited use of OCR; the second requires neither decompression nor OCR. Both techniques initially use the word bounding-box ratio feature to match words in a database of compressed document images. For every matching test word, the first model decompresses and OCRs the first two characters and matches them against the corresponding keyword characters. If that match succeeds, the remaining characters of the test word are decompressed and OCRed, and the result is matched against the full keyword. The second model extracts run-based features, such as the number of run transitions and the corresponding correlation of runs, along selected regions of the matching test word, and matches them against those of the specified keyword. Both methods work in the Run-Length Compressed Domain (RLCD) and can operate on CCITT Group 3 1D, CCITT Group 3 2D, and CCITT Group 4 2D compressed documents, as supported by the TIFF and PDF file formats. The efficacy of the two models is demonstrated through experimental results and comparative analysis.
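The decompression-less strategy summarized in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the `Word` record, the row layout (alternating white/black run lengths, starting with a possibly empty white run), the 10% ratio tolerance, and the exact-match rule on transition profiles are all assumptions made for illustration, and the run-correlation feature mentioned in the abstract is left out.

```python
from dataclasses import dataclass
from typing import List

# Assumed layout: each row is a list of alternating run lengths,
# white first (the leading white run may be 0), as in CCITT-style coding.
Row = List[int]


@dataclass
class Word:
    """Hypothetical record for one word region in a compressed page."""
    width: int
    height: int
    rows: List[Row]  # the selected rows of the word, still run-length coded


def ratio_matches(keyword: Word, candidate: Word, tol: float = 0.1) -> bool:
    """Bounding-box ratio pre-filter; the 10% tolerance is an assumption."""
    k = keyword.width / keyword.height
    c = candidate.width / candidate.height
    return abs(k - c) <= tol * k


def run_transitions(row: Row) -> int:
    """Count white/black colour changes in one row without decompressing it."""
    transitions = 0
    prev_color = None
    for i, length in enumerate(row):
        if length == 0:
            continue  # an empty run contributes no pixels
        color = i % 2  # 0 = white, 1 = black
        if prev_color is not None and color != prev_color:
            transitions += 1
        prev_color = color
    return transitions


def transition_profile(word: Word) -> List[int]:
    """Per-row transition counts, used here as the run-based feature."""
    return [run_transitions(r) for r in word.rows]


def spot(keyword: Word, candidates: List[Word]) -> List[int]:
    """Return indices of candidates that pass both the ratio filter
    and the (assumed) exact match on transition profiles."""
    target = transition_profile(keyword)
    return [
        i for i, w in enumerate(candidates)
        if ratio_matches(keyword, w) and transition_profile(w) == target
    ]


keyword = Word(6, 2, [[0, 2, 2, 2], [1, 4, 1]])
candidates = [
    Word(6, 2, [[0, 3, 1, 2], [2, 3, 1]]),   # similar shape and profile
    Word(12, 2, [[0, 2, 2, 2, 6], [1, 4, 7]]),  # wrong bounding-box ratio
    Word(6, 2, [[6], [0, 6]]),               # right ratio, different runs
]
matches = spot(keyword, candidates)
```

Applying the cheap ratio test first mirrors the abstract's description: only words that survive it need the run-level comparison, and no row is ever expanded into pixels.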