Compressed indexing and local alignment of DNA


Abstract

Motivation: Recent experimental studies on compressed indexes (BWT, CSA, FM-index) have confirmed their practicality for indexing very long strings, such as the human genome, in main memory. For example, a BWT index for the human genome (about 3 billion characters) occupies only around 1 GB. However, these indexes are designed for exact pattern matching, which is too stringent for biological applications. What is often needed instead is to find local alignments (pairs of similar substrings, with gaps allowed). Without indexing, one can use dynamic programming to find all the local alignments between a text T and a pattern P in O(|T||P|) time, but this is too slow when the text is of genome scale (e.g. aligning a gene with the human genome would take tens to hundreds of hours). In practice, biologists use heuristic-based software such as BLAST, which is very efficient but is not guaranteed to find all local alignments.
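The O(|T||P|) dynamic programming baseline mentioned above can be illustrated with a minimal Smith-Waterman sketch. This is not the paper's indexed method; it only shows the unindexed recurrence whose cost motivates the work. The function name `local_alignment_score` and the scoring values (match, mismatch, and a linear gap penalty) are assumptions chosen for illustration, since the abstract does not specify a scoring scheme. The sketch reports only the single best-scoring local alignment, whereas the abstract is concerned with finding all of them.

```python
def local_alignment_score(text, pattern, match=1, mismatch=-3, gap=-4):
    """Smith-Waterman with a linear gap penalty (illustrative scoring values).

    Returns (best_score, end_in_text, end_in_pattern), where the end
    positions are 1-based indices of the last aligned characters.
    Runs in O(|T||P|) time and O(|P|) space.
    """
    prev = [0] * (len(pattern) + 1)   # DP row for the previous text position
    best = (0, 0, 0)
    for i, t in enumerate(text, start=1):
        curr = [0] * (len(pattern) + 1)
        for j, p in enumerate(pattern, start=1):
            diag = prev[j - 1] + (match if t == p else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            if curr[j] > best[0]:
                best = (curr[j], i, j)
        prev = curr
    return best


if __name__ == "__main__":
    print(local_alignment_score("ACGTTGCA", "GTTG"))
```

Every cell of the |T| by |P| table is visited once, which is exactly why this approach becomes impractical when T is a whole genome.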

Bibliographic details

  • Source
    Bioinformatics | 2008, Issue 6 | pp. 791-797 | 7 pages
  • Author affiliations

    1 Department of Computer Science, University of Hong Kong, Hong Kong, China and 2 Department of Computer Science, National University of Singapore, Singapore

  • Indexed in: Science Citation Index (SCI, USA); Chemical Abstracts (CA, USA)
  • Original format: PDF
  • Language: English
