Published in: Data Compression Conference

A fast algorithm for making suffix arrays and for Burrows-Wheeler transformation

Abstract

We propose a fast and memory-efficient algorithm for sorting the suffixes of a text in lexicographic order. Sorting suffixes matters for two reasons. First, the array of suffix indexes in sorted order is the suffix array, a memory-efficient alternative to the suffix tree. Second, suffix sorting is used in the Burrows-Wheeler transformation (see Technical Report 124, Digital SRC Research Report, 1994) for block-sorting text compression, so fast sorting algorithms are desirable. We compare, in terms of speed and memory requirements, the suffix array construction algorithms of Bentley-Sedgewick (see Proceedings of the 8th Annual ACM-SIAM Symposium on Discrete Algorithms, p.360-9, 1997), Andersson-Nilsson (see 35th Symp. on Foundations of Computer Science, p.714-21, 1994) and Karp-Miller-Rosenberg (1972), and the suffix tree construction algorithm of Larsson (see Data Compression Conference, p.190-9, 1996). By combining them, we propose a new algorithm that is both fast and memory efficient. We also define a measure of the difficulty of sorting suffixes: the average match length. Our algorithm is effective when the average match length of a text is large, especially for large databases.
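To make the objects named in the abstract concrete, here is a minimal Python sketch of a suffix array, the Burrows-Wheeler transformation read off it, and an average match length. It is not the algorithm proposed in the paper: the construction is a naive comparison sort, chosen only for readability. The function names are hypothetical, the `$` sentinel is an illustrative choice, and the abstract does not state the formula for the average match length, so the sketch assumes it is the mean length of the longest common prefix between suffixes that are adjacent in sorted order.

```python
def suffix_array(text: str) -> list[int]:
    """Start positions of all suffixes of text, in lexicographic order.

    Naive construction: each comparison may scan a whole suffix, so this
    is only suitable for small inputs.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def bwt(text: str, sentinel: str = "$") -> str:
    """Burrows-Wheeler transformation via the suffix array of text + sentinel.

    Assumption: the sentinel does not occur in text. For the conventional
    output it should also compare smaller than every character of text.
    """
    if sentinel in text:
        raise ValueError("sentinel must not occur in the text")
    s = text + sentinel
    # The output character for each sorted suffix is the one preceding it;
    # for the suffix starting at 0, s[-1] wraps around to the sentinel.
    return "".join(s[i - 1] for i in suffix_array(s))


def common_prefix_length(text: str, i: int, j: int) -> int:
    """Length of the longest common prefix of the suffixes at i and j."""
    k = 0
    while i + k < len(text) and j + k < len(text) and text[i + k] == text[j + k]:
        k += 1
    return k


def average_match_length(text: str) -> float:
    """Mean common prefix length of adjacent sorted suffixes (assumed definition)."""
    sa = suffix_array(text)
    if len(sa) < 2:
        return 0.0
    total = sum(common_prefix_length(text, a, b) for a, b in zip(sa, sa[1:]))
    return total / (len(sa) - 1)


if __name__ == "__main__":
    print(suffix_array("banana"))
    print(bwt("banana"))
    print(average_match_length("banana"))
```

Under this assumed definition, highly repetitive texts have a large average match length, and each comparison in a naive sort must scan far into the suffixes before they differ. That is the situation in which the abstract says the proposed algorithm is effective.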