Conference: Pacific Rim International Conference on Artificial Intelligence

Improving the Results of De novo Peptide Identification via Tandem Mass Spectrometry Using a Genetic Programming-Based Scoring Function for Re-ranking Peptide-Spectrum Matches


Abstract

De novo peptide sequencing algorithms have been widely used in proteomics to analyse tandem mass spectra (MS/MS) and assign them to peptides, but quality-control methods for evaluating the confidence of de novo peptide sequencing lag behind. A fundamental part of a quality-control method is the scoring function used to evaluate the quality of peptide-spectrum matches (PSMs). Here, we propose a genetic programming (GP) based method, called GP-PSM, to learn a PSM scoring function for improving the rate of confident peptide identification from MS/MS data. The GP method learns from thousands of MS/MS spectra. Important characteristics of match quality are extracted from the learning set and incorporated into the GP scoring functions. We compare GP-PSM with two other methods, Support Vector Regression (SVR) and Random Forest (RF). GP, RF and SVR are each used to post-process the peptide identification results of PEAKS, a commonly used de novo sequencing method. The results show that GP-PSM outperforms RF and SVR and discriminates accurately between correct and incorrect PSMs. It correctly assigns peptides to 10% more spectra on an evaluation dataset containing 120 MS/MS spectra and decreases the false positive rate (FPR) of peptide identification.
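The re-ranking and FPR evaluation described in the abstract can be illustrated with a minimal sketch. It is not the GP-PSM scoring function, which the abstract does not give: the `PSM` class, the feature names `matched_ion_fraction` and `mass_error_ppm`, the hand-written `example_score`, and the sample data are all hypothetical stand-ins for a learned function and real PEAKS output.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class PSM:
    spectrum_id: str
    peptide: str
    features: Dict[str, float]  # hypothetical feature names, not from the paper
    is_correct: bool            # ground-truth label, available only for evaluation data


def example_score(psm: PSM) -> float:
    # Hand-written stand-in for a learned scoring function (assumption):
    # reward matched fragment ions, penalise precursor mass error.
    f = psm.features
    return f["matched_ion_fraction"] - 0.1 * abs(f["mass_error_ppm"])


def rerank(psms: List[PSM], score: Callable[[PSM], float]) -> Dict[str, PSM]:
    """Keep the highest-scoring candidate peptide for each spectrum."""
    best: Dict[str, PSM] = {}
    for psm in psms:
        current = best.get(psm.spectrum_id)
        if current is None or score(psm) > score(current):
            best[psm.spectrum_id] = psm
    return best


def false_positive_rate(psms: List[PSM], score: Callable[[PSM], float],
                        threshold: float) -> float:
    """FPR = incorrect PSMs accepted at the threshold / all incorrect PSMs."""
    negatives = [p for p in psms if not p.is_correct]
    if not negatives:
        return 0.0
    accepted = sum(1 for p in negatives if score(p) >= threshold)
    return accepted / len(negatives)


# Invented sample data: two spectra with two candidate peptides each.
sample = [
    PSM("s1", "PEPTIDE", {"matched_ion_fraction": 0.9, "mass_error_ppm": 1.0}, True),
    PSM("s1", "PEPTLDE", {"matched_ion_fraction": 0.6, "mass_error_ppm": 4.0}, False),
    PSM("s2", "ACDK", {"matched_ion_fraction": 0.5, "mass_error_ppm": 2.0}, False),
    PSM("s2", "ACDR", {"matched_ion_fraction": 0.7, "mass_error_ppm": 1.0}, True),
]

top = rerank(sample, example_score)
fpr = false_positive_rate(sample, example_score, threshold=0.25)
```

In this sketch, swapping `example_score` for a function learned by GP, RF or SVR leaves `rerank` and `false_positive_rate` unchanged, which is one way such methods could be compared on the same candidate list.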
