Accelerating String Set Matching in FPGA Hardware for Bioinformatics Research

Yoginder S Dandass; Shane C Burgess; Mark Lawrence; Susan M Bridges

首页> 外文期刊>BMC Bioinformatics >Accelerating String Set Matching in FPGA Hardware for Bioinformatics Research

【24h】

Accelerating String Set Matching in FPGA Hardware for Bioinformatics Research

机译：在FPGA硬件中加速字符串集匹配以进行生物信息学研究

获取原文

掌桥外文数据库（机构版） >>

开具论文收录证明 >>

文献代查 >>

页面导航

摘要
著录项
相似文献
相关主题

摘要

Background This paper describes techniques for accelerating the performance of the string set matching problem with particular emphasis on applications in computational proteomics. The process of matching peptide sequences against a genome translated in six reading frames is part of a proteogenomic mapping pipeline that is used as a case-study. The Aho-Corasick algorithm is adapted for execution in field programmable gate array (FPGA) devices in a manner that optimizes space and performance. In this approach, the traditional Aho-Corasick finite state machine (FSM) is split into smaller FSMs, operating in parallel, each of which matches up to 20 peptides in the input translated genome. Each of the smaller FSMs is further divided into five simpler FSMs such that each simple FSM operates on a single bit position in the input (five bits are sufficient for representing all amino acids and special symbols in protein sequences). Results This bit-split organization of the Aho-Corasick implementation enables efficient utilization of the limited random access memory (RAM) resources available in typical FPGAs. The use of on-chip RAM as opposed to FPGA logic resources for FSM implementation also enables rapid reconfiguration of the FPGA without the place and routing delays associated with complex digital designs. Conclusion Experimental results show storage efficiencies of over 80% for several data sets. Furthermore, the FPGA implementation executing at 100 MHz is nearly 20 times faster than an implementation of the traditional Aho-Corasick algorithm executing on a 2.67 GHz workstation.

机译：背景技术本文介绍了用于加速字符串集匹配问题性能的技术，尤其着重于计算蛋白质组学中的应用。肽序列与在六个阅读框中翻译的基因组匹配的过程是蛋白基因组作图流水线的一部分，用作案例研究。 Aho-Corasick算法适用于以优化空间和性能的方式在现场可编程门阵列（FPGA）设备中执行。在这种方法中，传统的Aho-Corasick有限状态机（FSM）被拆分为多个较小的FSM，它们并行运行，每个FSM与输入翻译基因组中的多达20个肽段匹配。每个较小的FSM进一步分为五个更简单的FSM，以使每个简单FSM在输入中的单个位上运行（五个位足以代表蛋白质序列中的所有氨基酸和特殊符号）。结果通过Aho-Corasick实现的按位拆分组织，可以有效利用典型FPGA中可用的有限随机存取存储器（RAM）资源。使用片上RAM而不是FPGA逻辑资源来实现FSM，还可以快速重新配置FPGA，而不会出现与复杂数字设计相关的布局和布线延迟。结论实验结果表明，多个数据集的存储效率超过80％。此外，以100 MHz执行的FPGA实现比在2.67 GHz工作站上执行的传统Aho-Corasick算法的实现快近20倍。

著录项

来源
《BMC Bioinformatics》 |2008年第1期|共页
作者
Yoginder S Dandass; Shane C Burgess; Mark Lawrence; Susan M Bridges;
展开▼
作者单位

展开▼
收录信息
原文格式 PDF
正文语种
中图分类生物科学;
关键词

相似文献

外文文献
中文文献
专利

1. Hardware-Software Codesign Based Accelerated and Reconfigurable Methodology for String Matching in Computational Bioinformatics Applications [J] . Gudur Venkateshwarlu Y., Acharyya Amit IEEE/ACM transactions on computational biology and bioinformatics . 2020,第4期

机译：基于硬件 - 软件代码的计算生物信息应用程序中的字符串匹配的加速与可重新配置方法
2. Tree-based string pattern matching on FPGAs [J] . Erdem Oguzhan Computers and Electrical Engineering . 2016,第Null期

机译：FPGA上基于树的字符串模式匹配
3. Optimizing Hardware Accelerated General Matrix-Matrix Multiplication for CNNs on FPGAs [J] . Afzal Ahmad, Muhammad Adeel Pasha Circuits and Systems II: Express Briefs, IEEE Transactions on . 2020,第11期

机译：优化硬件加速FPGA上CNN的常规矩阵矩阵乘法
4. Accelerated reconfigurable string matching using hardware-software codesign for computational bioinformatics applications [C] . Venkateshwarlu Y. Gudur, Amit Acharyya European Conference on Circuit Theory and Design . 2017

机译：使用硬件-软件代号来加速可重新配置的字符串匹配，以用于计算生物信息学应用
5. FPGA Accelerated Bioinformatics: Alignment, Classification, Homology and Counting [D] . McVicar, Nathaniel. 2018

机译：FPGA加速生物信息学：对齐，分类，同源性和计数
6. Accelerating String Set Matching in FPGA Hardware for Bioinformatics Research [O] . Yoginder S Dandass, Shane C Burgess, Mark Lawrence, 2008

机译：在FPGA硬件中加速字符串集匹配以进行生物信息学研究
7. Accelerating String Set Matching in FPGA Hardware for Bioinformatics Research [O] . Yoginder S Dandass, Shane C Burgess, Mark Lawrence, 2008

机译：在FPGA硬件中加速字符串集匹配以进行生物信息学研究

Accelerating String Set Matching in FPGA Hardware for Bioinformatics Research

摘要

著录项

相似文献

相关主题

期刊订阅