Linking indexing data structures to de Bruijn graphs: Construction and update

Abstract

DNA sequencing technologies have tremendously increased their throughput, and hence complicated DNA assembly. Numerous assembly programs use de Bruijn graphs (dBG) built from short reads to merge these into contigs, which represent putative DNA segments. In a dBG of order k, nodes are substrings of length k of reads (or k-mers), while arcs are their (k+1)-mers. As analysing reads often requires indexing all their substrings, it is interesting to exhibit algorithms that directly build a dBG from a pre-existing index, and especially a contracted dBG, where non-branching paths are condensed into single nodes. Here, we exhibit linear time algorithms for constructing the full or contracted dBGs from suffix trees, suffix arrays, and truncated suffix trees. With the latter, the construction uses space that is linear in the size of the dBG. Finally, we also provide algorithms to dynamically update the order of the graph without reconstructing it. (C) 2016 The Author(s). Published by Elsevier Inc.
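
To make the definitions in the abstract concrete, here is a minimal Python sketch of a dBG of order k and its contracted form. It is not the paper's method: it collects k-mers and (k+1)-mers by direct enumeration with hash sets rather than from a suffix tree or suffix array, and it makes no claim to the linear time or space bounds stated above. The function names `build_dbg` and `contract` are hypothetical, chosen only for this illustration.

```python
from collections import defaultdict


def build_dbg(reads, k):
    """Return (nodes, arcs): nodes are the k-mers, arcs the (k+1)-mers of the reads.

    Naive enumeration (an assumption of this sketch), not an index-based build.
    """
    nodes, arcs = set(), set()
    for read in reads:
        for i in range(len(read) - k + 1):
            nodes.add(read[i:i + k])
        for i in range(len(read) - k):
            arcs.add(read[i:i + k + 1])
    return nodes, arcs


def contract(nodes, arcs, k):
    """Condense each maximal non-branching path into one node.

    Each contracted node is returned as the string spelled by its path.
    """
    succ, pred = defaultdict(list), defaultdict(list)
    for a in arcs:
        # A (k+1)-mer links its length-k prefix to its length-k suffix.
        succ[a[:k]].append(a[1:])
        pred[a[1:]].append(a[:k])

    def mergeable(u, v):
        # Arc u -> v can be contracted if it is u's only exit and v's only entry.
        return len(succ[u]) == 1 and len(pred[v]) == 1 and u != v

    def starts_path(u):
        return not (len(pred[u]) == 1 and mergeable(pred[u][0], u))

    # Path starts first; isolated cycles have no start, so sweep all nodes after.
    order = [u for u in sorted(nodes) if starts_path(u)] + sorted(nodes)
    seen, contracted = set(), []
    for u in order:
        if u in seen:
            continue
        seen.add(u)
        label = u
        while (len(succ[u]) == 1 and mergeable(u, succ[u][0])
               and succ[u][0] not in seen):
            u = succ[u][0]
            seen.add(u)
            label += u[-1]  # each step along the path adds one character
        contracted.append(label)
    return contracted


reads = ["ACGTC", "ACGTG"]
nodes, arcs = build_dbg(reads, 2)
print(sorted(nodes))              # k-mers of the reads
print(sorted(arcs))               # (k+1)-mers of the reads
print(contract(nodes, arcs, 2))   # non-branching path AC -> CG -> GT becomes one node
```

In this toy input the two reads share the prefix ACGT and then diverge, so the nodes AC, CG and GT form a non-branching path that is condensed into a single node, while TC and TG stay separate because GT branches.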

Bibliographic details

  • Source
    Journal of computer and system sciences | 2019, No. 9 | pp. 165-183 | 19 pages
  • Author affiliations

    CNRS LIRMM 161 Rue Ada F-34095 Montpellier 5 France|Univ Montpellier 161 Rue Ada F-34095 Montpellier 5 France|CNRS Inst Biol Computat 860 Rue St Priest F-34095 Montpellier 5 France|Univ Montpellier 860 Rue St Priest F-34095 Montpellier 5 France;

    Normandie Univ F-76000 Rouen France|UNIROUEN UNIHAVRE INSA Rouen LITIS F-76000 Rouen France;

    CNRS LIRMM 161 Rue Ada F-34095 Montpellier 5 France|Univ Montpellier 161 Rue Ada F-34095 Montpellier 5 France|CNRS Inst Biol Computat 860 Rue St Priest F-34095 Montpellier 5 France|Univ Montpellier 860 Rue St Priest F-34095 Montpellier 5 France;

  • Indexed in: Science Citation Index (SCI, USA); Engineering Index (EI, USA)
  • Original format: PDF
  • Language of text: English
  • Keywords

    Index; Data structure; Suffix tree; Suffix array; Dynamic update; Overlap; Contracted de Bruijn graph; Assembly; Algorithms; Bioinformatics;
