Journal: BMC Bioinformatics

Efficient parallel and out of core algorithms for constructing large bi-directed de Bruijn graphs

Abstract

Background: Assembling genomic sequences from a set of overlapping reads is one of the most fundamental problems in computational biology. Algorithms addressing the assembly problem fall into two broad categories based on the data structures they employ: the first uses an overlap/string graph and the second uses a de Bruijn graph. With recent advances in short-read sequencing technology, de Bruijn graph based algorithms seem to play a vital role in practice. Efficient algorithms for building these massive de Bruijn graphs are essential in large sequencing projects based on short reads. In an earlier work, an O(n/p) time parallel algorithm was given for this problem, where n is the size of the input and p is the number of processors. That algorithm enumerates all possible bi-directed edges that can overlap with a node and ends up generating Θ(nΣ) messages (Σ being the size of the alphabet).
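
To make the Θ(nΣ) figure concrete, the following is a minimal sequential sketch in Python of edge enumeration over a bi-directed de Bruijn graph. It is not the parallel algorithm from the earlier work, nor the one proposed in this paper. It assumes a DNA alphabet (Σ = 4), represents each node by the lexicographically smaller of a k-mer and its reverse complement, and treats two nodes as adjacent whenever they overlap by k-1 characters. The orientation labels on bi-directed edges are omitted, and all function names are hypothetical.

```python
ALPHABET = "ACGT"  # assumed DNA alphabet, so sigma = 4
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def canonical(kmer):
    """Represent a k-mer and its reverse complement by one node label."""
    return min(kmer, reverse_complement(kmer))


def candidate_neighbors(kmer):
    """List all 2 * sigma canonical k-mers that could overlap `kmer` by k-1.

    Extending both strands on the right covers successors and predecessors,
    because a predecessor of a k-mer is a successor of its reverse complement.
    """
    candidates = []
    for strand in (kmer, reverse_complement(kmer)):
        for symbol in ALPHABET:
            candidates.append(canonical(strand[1:] + symbol))
    return candidates


def build_graph(reads, k):
    """Collect canonical k-mer nodes, then probe every candidate neighbor.

    `probes` counts the lookups, which play the role of the messages in a
    distributed setting: 2 * sigma per node, whether or not the edge exists.
    """
    nodes = set()
    for read in reads:
        for i in range(len(read) - k + 1):
            nodes.add(canonical(read[i:i + k]))

    edges = set()
    probes = 0
    for node in nodes:
        for candidate in candidate_neighbors(node):
            probes += 1
            if candidate in nodes:
                # frozenset keeps the edge undirected; a self-loop has size 1
                edges.add(frozenset((node, candidate)))
    return nodes, edges, probes


if __name__ == "__main__":
    nodes, edges, probes = build_graph(["ACGTAC"], 3)
    print(sorted(nodes), len(edges), probes)
```

In this sketch every node issues 2Σ probes regardless of how many of its candidate neighbors actually occur in the reads, which is the source of the Θ(nΣ) message volume the abstract describes.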