Published in: IEEE International Conference on Bioinformatics and Bioengineering

Towards Centralized MS/MS Spectra Preprocessing: An Empirical Evaluation of Peptides Search Engines using Ground Truth Datasets

Abstract

Several peptide search engines have been developed in recent decades. Most of the time, given the same inputs, different search engines identify different peptides, which can confuse stakeholders in the field of proteomics. The massive volume of spectra generated by high-throughput spectrometers adds another challenge that handicaps current search engines. This has motivated researchers to evaluate combinations of several search engines. Several studies have provided ensemble solutions over shared and distributed computing environments to obtain reliable results. However, the massive volume of MS/MS spectra creates cumbersome traffic over the systems' networks. This issue directly degrades search performance and adds unnecessary costs (computing, storage, network traffic) when a cloud cluster is used. The main question of this paper is: can we build a central MS/MS spectra preprocessing step for semantically different protein search engines? We evaluate different statistical reduction techniques using four popular protein search engines. To evaluate the results fairly, we build unanimity-based ground truth datasets for two different species: yeast and human. Our techniques achieve significant peak reduction: only around 30% of the spectrum peaks are enough to report reliable identifications from the search engines used in this study.
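
The abstract does not say which statistical reduction techniques were evaluated. As a rough illustration of what peak reduction can look like, the sketch below keeps only the most intense fraction of peaks in a single spectrum. The function name `keep_top_fraction`, the `(m/z, intensity)` tuple layout, and the intensity-ranking rule are all assumptions made for this example, not the authors' method; the 0.3 default simply mirrors the "around 30%" figure quoted above.

```python
from typing import List, Tuple

# Assumed representation: one peak is an (m/z, intensity) pair.
Peak = Tuple[float, float]


def keep_top_fraction(peaks: List[Peak], fraction: float = 0.3) -> List[Peak]:
    """Keep the most intense `fraction` of peaks, returned in m/z order.

    Hypothetical helper for illustration only; the paper's actual
    reduction techniques are not described in the abstract.
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    if not peaks:
        return []
    # Always keep at least one peak so a spectrum never becomes empty.
    n_keep = max(1, round(len(peaks) * fraction))
    # Rank by intensity (descending) and take the strongest peaks.
    strongest = sorted(peaks, key=lambda p: p[1], reverse=True)[:n_keep]
    # Restore m/z order, which spectrum file formats usually expect.
    return sorted(strongest, key=lambda p: p[0])


# A toy spectrum with ten peaks (made-up values).
spectrum = [
    (100.0, 5.0), (150.0, 50.0), (200.0, 1.0), (250.0, 80.0), (300.0, 3.0),
    (350.0, 20.0), (400.0, 2.0), (450.0, 60.0), (500.0, 4.0), (550.0, 10.0),
]
reduced = keep_top_fraction(spectrum, fraction=0.3)
print(reduced)
```

Running the reduction once, before the spectra are sent to each search engine, is the kind of central preprocessing the abstract asks about: every engine would receive the same smaller input instead of the full peak list.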