首页> 外文会议>Annual symposium on combinatorial pattern matching >Efficient Construction of a Compressed de Bruijn Graph for Pan-Genome Analysis
【24h】

Efficient Construction of a Compressed de Bruijn Graph for Pan-Genome Analysis

机译:用于全基因组分析的压缩de Bruijn图的有效构造

获取原文

摘要

Recently, Marcus et al. (Bioinformatics 2014) proposed to use a compressed de Bruijn graph of maximal exact matches to describe the relationship between the genomes of many individuals/strains of the same or closely related species. They devised an O(n log g) time algorithm called splitMEM that constructs this graph directly (i.e., without using the uncompressed de Bruijn graph) based on a suffix tree, where n is the total length of the genomes and g is the length of the longest genome. In this paper, we present an algorithm that outperforms their algorithm in theory and in practice. More precisely, our algorithm has a better worst-case time complexity of O(n log σ), where σ is the size of the alphabet (σ = 4 for DNA). Moreover, experiments show that it is much faster than splitMEM while using only a fraction of the space required by splitMEM.
机译:最近,Marcus等人。 (Bioinformatics 2014)建议使用最大精确匹配的压缩de Bruijn图来描述相同或紧密相关物种的许多个体/株的基因组之间的关系。他们设计了一种称为splitMEM的O(n log g)时间算法,该算法基于后缀树直接构造该图(即,不使用未压缩的de Bruijn图),其中n是基因组的总长度,g是基因组的长度。最长的基因组。在本文中,我们提出了一种在理论上和实践上均优于其算法的算法。更准确地说,我们的算法具有更好的最坏情况下的时间复杂度O(n logσ),其中σ是字母的大小(对于DNA,σ= 4)。而且,实验表明它比splitMEM快得多,而仅占用splitMEM所需空间的一小部分。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号