Journal: Theoretical Computer Science

The affix array data structure and its applications to RNA secondary structure analysis


Abstract

Efficient string processing in large data sets such as complete genomes is strongly connected to the suffix tree and similar index data structures. For complex structural string analysis, such as the search for RNA secondary structure patterns, unidirectional suffix tree algorithms are inferior to bidirectional algorithms based on the affix tree data structure. The affix tree incorporates the suffix tree of the text and the suffix tree of the reverse text in one tree structure, but it suffers from large memory requirements. In this paper I present a new data structure, denoted the affix array, which is equivalent to the affix tree with respect to its algorithmic functionality, but has smaller memory requirements and improved performance. I will show a linear-time construction of the affix array that does not make use of the linear-time construction of the affix tree. I will also show how bidirectional affix tree traversals can be transferred to the affix array, and present the impressive results of large-scale RNA secondary structure analysis based on the new data structure. © 2007 Elsevier B.V. All rights reserved.
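The idea of indexing both the text and its reverse can be illustrated with a small Python sketch. This is not the paper's affix array or its linear-time construction: it is a simplified stand-in, assuming two independent suffix arrays built naively by sorting, with no links between them. The function names (`suffix_array`, `sa_interval`, `occurrences`, `find_both_directions`) and the example sequence are hypothetical and chosen only for illustration. The sketch shows that a pattern searched in the forward index and its reversal searched in the reverse index describe the same occurrences, which is the correspondence a bidirectional traversal relies on.

```python
from bisect import bisect_left, bisect_right


def suffix_array(text):
    """Naive suffix array: start positions sorted by suffix (not linear time)."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def sa_interval(text, sa, pattern):
    """Half-open interval [lo, hi) of entries in sa whose suffix starts with pattern."""
    m = len(pattern)
    # Truncating sorted suffixes to length m keeps them in sorted order.
    prefixes = [text[i:i + m] for i in sa]
    return bisect_left(prefixes, pattern), bisect_right(prefixes, pattern)


def occurrences(text, sa, pattern):
    """Sorted start positions of pattern in text, looked up through sa."""
    lo, hi = sa_interval(text, sa, pattern)
    return sorted(sa[lo:hi])


def find_both_directions(text, pattern):
    """Search pattern in the forward index and its reversal in the reverse index.

    Returns (forward_hits, mapped_reverse_hits); both lists should be equal.
    """
    rev = text[::-1]
    sa_fwd = suffix_array(text)  # index over the text
    sa_rev = suffix_array(rev)   # index over the reverse text
    fwd_hits = occurrences(text, sa_fwd, pattern)
    rev_hits = occurrences(rev, sa_rev, pattern[::-1])
    # A match starting at j in the reverse text starts at n - j - m in the text.
    n, m = len(text), len(pattern)
    mapped = sorted(n - j - m for j in rev_hits)
    return fwd_hits, mapped


if __name__ == "__main__":
    # Hypothetical RNA-like sequence, used only as example input.
    fwd, mapped = find_both_directions("GGACUUCGGUCC", "GG")
    print(fwd, mapped)
```

Keeping two unlinked arrays is enough to show the position correspondence, but it does not support extending a match alternately to the left and to the right in a single traversal; that capability is what the abstract attributes to the affix tree and the affix array.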
