LBE: A Computational Load Balancing Algorithm for Speeding up Parallel Peptide Search in Mass-Spectrometry Based Proteomics


Abstract

The most commonly employed method for peptide identification in mass-spectrometry based proteomics involves comparing experimentally obtained tandem MS/MS spectra against a set of theoretical MS/MS spectra. The theoretical MS/MS spectra are predicted from a protein sequence database. Most state-of-the-art peptide search algorithms index the theoretical spectra so that, when searching an experimental MS/MS spectrum, they can quickly filter in the relevant (similar) indexed spectra. This filtration substantially reduces the number of computationally expensive spectrum-to-spectrum comparisons required. However, the number of predicted (and indexed) theoretical spectra grows exponentially with the number of post-translational modifications, creating a memory and I/O bottleneck. In this paper, we present a parallel algorithm, called LBE, for efficient partitioning of theoretical spectra data on a distributed-memory architecture. Our proposed algorithm first groups similar theoretical spectra. The groups are then finely split across the system, allowing machines to perform an almost equal amount of work when querying an MS/MS spectrum. Our results show that the compute load imbalance using LBE-based data distribution is at most 20%, allowing speedups of orders of magnitude over existing methods. The proposed algorithm has been implemented on a compute cluster using the MPI library. Experimental results for increasing index sizes are reported in terms of execution time, speedup and memory footprint. To the best of our knowledge, LBE is the first load-balancing technique for MS/MS proteomics data on distributed-memory clusters that incorporates proteomics domain knowledge for efficient load balancing. Source code is made available at: https://github.com/pcdslab/lbdslim/tree/mpi.
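The group-then-split idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not state how LBE measures similarity or how it splits groups, so the sketch assumes a hypothetical grouping key (precursor mass rounded into bins), a simple cyclic deal of group members across ranks, and an imbalance metric defined as (max load - mean load) / mean load. All function names are illustrative, and no MPI calls are made; each inner list simply stands for the data one machine would hold.

```python
from collections import defaultdict


def group_spectra(masses, bin_width=1.0):
    """Group spectrum ids whose mass falls in the same bin (assumed similarity key)."""
    groups = defaultdict(list)
    for idx, mass in enumerate(masses):
        groups[int(mass // bin_width)].append(idx)
    return [groups[key] for key in sorted(groups)]


def partition_cyclic(groups, num_ranks):
    """Deal group members out one at a time so each rank holds a slice of every group."""
    parts = [[] for _ in range(num_ranks)]
    cursor = 0
    for group in groups:
        for idx in group:
            parts[cursor % num_ranks].append(idx)
            cursor += 1
    return parts


def partition_contiguous(num_items, num_ranks):
    """Baseline for comparison: split the index into contiguous chunks."""
    chunk = -(-num_items // num_ranks)  # ceiling division
    return [list(range(r * chunk, min((r + 1) * chunk, num_items)))
            for r in range(num_ranks)]


def query_load(parts, masses, query_mass, tolerance):
    """Count, per rank, the indexed spectra that pass the mass filter for one query."""
    return [sum(1 for i in part if abs(masses[i] - query_mass) <= tolerance)
            for part in parts]


def load_imbalance(loads):
    """Assumed metric: (max load - mean load) / mean load."""
    mean = sum(loads) / len(loads)
    return (max(loads) - mean) / mean


if __name__ == "__main__":
    # Synthetic index: 600 near-identical masses plus 400 well-separated ones.
    masses = [500.0 + 0.001 * i for i in range(600)] + [1000.0 + i for i in range(400)]
    balanced = partition_cyclic(group_spectra(masses), 4)
    baseline = partition_contiguous(len(masses), 4)
    for name, parts in (("cyclic", balanced), ("contiguous", baseline)):
        loads = query_load(parts, masses, query_mass=500.3, tolerance=1.0)
        print(name, loads, round(load_imbalance(loads), 3))
```

In this toy setup every rank stores the same number of spectra under both schemes, but only the cyclic split spreads the spectra matching a given query evenly, which is the property the abstract attributes to LBE's fine-grained splitting of groups.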


Bibliographic details

Source
IEEE International Parallel and Distributed Processing Symposium Workshops, 2019, pp. 191-198 (8 pages)
Authors
Muhammad Haseeb; Fatima Afzali; Fahad Saeed
Original format
PDF
Keywords
Peptides; Indexes; Clustering algorithms; Partitioning algorithms; Proteomics; Distributed databases


