Journal: Bioinformatics

Fast tandem mass spectra-based protein identification regardless of the number of spectra or potential modifications examined



Abstract

Motivation: Comparing tandem mass spectra (MSMS) against a known dataset of protein sequences is a common method for identifying unknown proteins; however, the processing of MSMS by current software often limits certain applications, including comprehensive coverage of post-translational modifications, non-specific searches and real-time searches to allow result-dependent instrument control. This problem deserves attention as new mass spectrometers provide the ability for higher throughput and as known protein datasets rapidly grow in size. New software algorithms need to be devised in order to address the performance issues of conventional MSMS protein dataset-based protein identification.

Methods: This paper describes a novel algorithm based on converting a collection of monoisotopic, centroided spectra to a new data structure, named 'peptide finite state machine' (PFSM), which may be used to rapidly search a known dataset of protein sequences, regardless of the number of spectra searched or the number of potential modifications examined. The algorithm is verified using a set of commercially available tryptic digest protein standards analyzed using an ABI 4700 MALDI TOFTOF mass spectrometer, and a free, open source PFSM implementation. It is illustrated that a PFSM can accurately search large collections of spectra against large datasets of protein sequences (e.g. NCBI nr) using a regular desktop PC; however, this paper only details the method for identifying peptide and subsequently protein candidates from a dataset of known protein sequences. The concept of using a PFSM as a peptide pre-screening technique for MSMS-based search engines is validated by using PFSM with Mascot and XTandem.
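The abstract does not say how a PFSM is built, so the sketch below is not the paper's algorithm. It is a generic Aho-Corasick automaton, shown only to illustrate the underlying idea: once many patterns are compiled into one finite state machine, each protein sequence is read once, one residue at a time, however many patterns were compiled in. The tag strings, the example sequence, and the function names `build_automaton` and `scan` are hypothetical placeholders standing in for whatever peptide-level patterns a collection of spectra would yield.

```python
from collections import deque


def build_automaton(tags):
    """Compile short peptide tags into an Aho-Corasick automaton.

    Illustrative only: the real PFSM is built from spectra, not from
    ready-made tag strings as assumed here.
    """
    goto = [{}]    # goto[state][residue] -> next state
    fail = [0]     # failure link of each state
    out = [set()]  # tags recognised on entering each state

    # Insert every tag into a trie.
    for tag in tags:
        state = 0
        for residue in tag:
            if residue not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append(set())
                goto[state][residue] = len(goto) - 1
            state = goto[state][residue]
        out[state].add(tag)

    # Breadth-first pass to fill in failure links and merge outputs.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for residue, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and residue not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(residue, 0)
            out[nxt] |= out[fail[nxt]]
    return goto, fail, out


def scan(sequence, automaton):
    """Read a protein sequence once and report (start, tag) for every match."""
    goto, fail, out = automaton
    state = 0
    hits = []
    for pos, residue in enumerate(sequence):
        # Follow failure links until the residue can be consumed.
        while state and residue not in goto[state]:
            state = fail[state]
        state = goto[state].get(residue, 0)
        for tag in out[state]:
            hits.append((pos - len(tag) + 1, tag))
    return hits


# Hypothetical tags and a made-up sequence, for illustration only.
automaton = build_automaton(["PEP", "EPT", "TIDE"])
print(sorted(scan("MPEPTIDEK", automaton)))
# [(1, 'PEP'), (2, 'EPT'), (4, 'TIDE')]
```

In this sketch, adding more tags makes the automaton larger but leaves the scan loop at one step per residue plus the cost of reporting matches, which is the property the abstract attributes to searching with a PFSM.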
