首页> 外文期刊>ACM Transactions on Information Systems >Practical Linear-Time O(1)-Workspace Suffix Sorting for Constant Alphabets
【24h】

Practical Linear-Time O(1)-Workspace Suffix Sorting for Constant Alphabets

机译:实用线性时间O(1)-工作空间后缀排序的恒定字母

获取原文
获取原文并翻译 | 示例

摘要

This article presents an O(n)-time algorithm called SACA-K for sorting the suffixes of an input string T [0, n- 1] over an alphabet A[0, K-1]. The problem of sorting the suffixes of T is also known as constructing the suffix array (SA) for T. The theoretical memory usage of SACA-K is n log K+n log n+Klog n bits. Moreover, we also have a practical implementation for SACA-K that uses n bytes + (n + 256) words and is suitable for strings over any alphabet up to full ASCII, where a word is log n bits. In our experiment, SACA-K outperforms SA-IS that was previously the most time- and space-efficient linear-time SA construction algorithm (SACA). SACA-K is around 33% faster and uses a smaller deterministic workspace of K words, where the workspace is the space needed beyond the input string and the output SA. Given K = 0(1), SACA-K runs in linear time and 0(1) workspace. To the best of our knowledge, such a result is the first reported in the literature with a practical source code publicly available.
机译:本文介绍了一种称为SACA-K的O(n)时间算法,用于对字母A [0,K-1]上的输入字符串T [0,n-1]的后缀进行排序。排序T后缀的问题也称为构造T的后缀数组(SA)。SACA-K的理论内存使用量为n log K + n log n + Klog n位。此外,我们还为SACA-K提供了一个实用的实现,它使用n个字节+(n + 256)个单词,适用于任何字母的字符串,直到完整ASCII,其中一个单词为log n位。在我们的实验中,SACA-K优于SA-IS,后者以前是时间和空间效率最高的线性时间SA构造算法(SACA)。 SACA-K的速度提高了约33%,并使用了K个词的较小的确定性工作空间,该工作空间是输入字符串和输出SA之外所需的空间。给定K = 0(1),SACA-K在线性时间和0(1)工作空间中运行。据我们所知,这样的结果是文献中首次报道的,并提供了可公开获得的实用源代码。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号