首页> 外文学位 >Compressed pattern matching for text and images.
【24h】

Compressed pattern matching for text and images.

机译:文本和图像的压缩模式匹配。

获取原文
获取原文并翻译 | 示例

摘要

In this dissertation, we have studied compressed pattern matching problem for both text and images. We present a series of novel compressed pattern matching algorithms, which are divided into two major parts. The first major work is done for the popular LZW compression algorithm. The second major work is done for the current lossless image compression standard JPEG-LS. Specifically, our contributions from the first major work are: (1) We have developed an "almost-optimal" compressed pattern matching algorithm that reports all pattern occurrences. An earlier "almost-optimal" algorithm reported in the literature is only capable of detecting the first occurrence of the pattern and the practical performance of the algorithm is not clear. We have implemented our algorithm and provide extensive experimental results measuring the speed of our algorithm. We also developed a faster implementation for so-called "simple patterns". The simple patterns are patterns that no unique symbol appears more than once. (2) We have developed a novel compressed pattern matching algorithm for multiple patterns using the Aho-Corasick algorithm. The algorithm takes O(mt + n + r) time with O(mt) extra space, where n is the size of the compressed file, m is the total size of all patterns, t is the size of the LZW trie and r is the number of occurrences of the patterns.; All the above algorithms have been implemented and extensive experiments have been conducted to test the performance of our algorithms and to compare with the best existing algorithms. The experimental results show that our compressed pattern matching algorithm for multiple patterns is competitive among the best algorithms and is practically the fastest among all approaches when the number of patterns is not very large. Therefore, our algorithm is preferable for general string matching applications. LZW is one of the most efficient and popular compression algorithms used extensively and both of our algorithms require no modification on the compression algorithm. Our work, therefore, has great economical and market potential.; Our contributions from the second major work are: (1) We have developed a new global context variation of the JPEG-LS compression algorithm and the corresponding compressed pattern matching algorithm. Comparing to the original JPEG-LS, the global context variation is search-aware and has faster encoding and decoding speeds. (2) We have developed a two-pass variation of the JPEG-LS algorithm and the corresponding compressed pattern matching algorithm. The two-pass variation achieves search-awareness through a common compression technique called semi static dictionary. (Abstract shortened by UMI.)
机译:本文研究了文本和图像的压缩模式匹配问题。我们提出了一系列新颖的压缩模式匹配算法,分为两个主要部分。流行的LZW压缩算法已完成了第一项主要工作。当前的无损图像压缩标准JPEG-LS已完成第二项主要工作。具体来说,我们从第一项主要工作中所做的贡献是:(1)我们开发了一种“几乎最佳”的压缩模式匹配算法,该算法报告了所有模式出现的情况。文献中报道的较早的“几乎最佳”算法仅能够检测模式的首次出现,并且该算法的实际性能尚不清楚。我们已经实现了算法,并提供了测量算法速度的广泛实验结果。我们还为所谓的“简单模式”开发了一种更快的实现。简单模式是没有唯一符号出现一次以上的模式。 (2)我们使用Aho-Corasick算法为多种模式开发了一种新颖的压缩模式匹配算法。该算法需要O(mt + n + r)的时间以及O(mt)的额外空间,其中n是压缩文件的大小,m是所有模式的总大小,t是LZW trie的大小,r是模式的出现次数。以上所有算法均已实施,并进行了广泛的实验,以测试我们算法的性能并与现有最佳算法进行比较。实验结果表明,当模式数量不是很大时,我们的多种模式的压缩模式匹配算法在最佳算法中具有竞争优势,并且在所有方法中实际上是最快的。因此,对于一般的字符串匹配应用程序,我们的算法更可取。 LZW是广泛使用的最高效,最流行的压缩算法之一,我们的两种算法都不需要修改压缩算法。因此,我们的工作具有巨大的经济和市场潜力。我们在第二项主要工作中所做的贡献是:(1)我们开发了JPEG-LS压缩算法和相应的压缩模式匹配算法的新全局上下文变体。与原始JPEG-LS相比,全局上下文变体具有搜索意识,并且具有更快的编码和解码速度。 (2)我们开发了JPEG-LS算法和相应的压缩模式匹配算法的两遍变体。两遍变体通过一种称为半静态字典的通用压缩技术来实现搜索意识。 (摘要由UMI缩短。)

著录项

  • 作者

    Tao, Tao.;

  • 作者单位

    University of Central Florida.;

  • 授予单位 University of Central Florida.;
  • 学科 Computer Science.
  • 学位 Ph.D.
  • 年度 2005
  • 页码 151 p.
  • 总页数 151
  • 原文格式 PDF
  • 正文语种 eng
  • 中图分类 自动化技术、计算机技术;
  • 关键词

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号