Journal: BMC Genomics

MUMAL: Multivariate analysis in shotgun proteomics using machine learning techniques


Abstract

Background

The shotgun strategy (liquid chromatography coupled with tandem mass spectrometry) is widely applied to identify proteins in complex mixtures. A single run produces thousands of spectra, which are interpreted by computational tools. Such tools normally use a protein database from which peptide sequences are extracted and matched against the experimentally derived mass spectral data. After the database search, the correctness of the resulting peptide-spectrum matches (PSMs) must also be evaluated algorithmically, because manual curation of such large datasets would be impractical. The target-decoy database strategy is widely used for this evaluation. Nonetheless, it has been applied without considering sensitivity, i.e., only error estimation is taken into account. A recently proposed method termed MUDE treats the target-decoy analysis as an optimization problem in which sensitivity is maximized. This method yields a significant increase in the number of PSMs retrieved at a fixed error rate. However, the MUDE model is constructed so that linear decision boundaries separate correct from incorrect PSMs. In addition, the heuristic described for solving the optimization problem has to be executed many times to achieve a significant gain in sensitivity.

Results

Here, we propose a new method for PSM assessment, termed MUMAL, that is based on machine learning techniques. Our method can establish nonlinear decision boundaries, which raises the chance of retrieving more true positives. Furthermore, it needs only a few iterations to achieve high sensitivity, strikingly shortening the running time of the whole process. Experiments show that our method retrieves considerably more PSMs than standard tools such as MUDE, PeptideProphet, and typical target-decoy approaches.

Conclusion

Our approach not only enhances computational performance, and thus the turnaround time of MS-based experiments in proteomics, but also improves the information content through higher proteome coverage. This improvement, for instance, increases the chance of identifying important drug targets or biomarkers for drug development or molecular diagnostics.
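
The Background describes choosing which PSMs to accept so that as many as possible are retrieved at a fixed error rate. The sketch below illustrates that idea in its simplest form, a single score threshold, and is not the MUDE or MUMAL procedure (both are multivariate). It assumes one common convention for the target-decoy error estimate, decoy hits divided by target hits above the threshold; the function names, the `(score, is_decoy)` layout, and the example scores are hypothetical.

```python
from typing import List, Tuple

# A PSM is represented here as (score, is_decoy); a higher score is assumed
# to mean a better match. This layout is an assumption for illustration.
PSM = Tuple[float, bool]


def decoy_fdr(n_targets: int, n_decoys: int) -> float:
    """Estimate the error rate as decoy hits / target hits.

    This is one common target-decoy convention; other estimators exist.
    """
    return n_decoys / n_targets if n_targets else 0.0


def best_threshold(psms: List[PSM], max_fdr: float) -> Tuple[float, int]:
    """Return (score threshold, accepted target count).

    Scans every candidate cut-off and keeps the one that accepts the most
    target PSMs while the estimated error rate stays at or below max_fdr.
    Returns (inf, 0) when no cut-off qualifies.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    targets = decoys = 0
    best: Tuple[float, int] = (float("inf"), 0)
    for i, (score, is_decoy) in enumerate(ranked):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Only cut between distinct scores so tied PSMs are never split.
        at_boundary = i + 1 == len(ranked) or ranked[i + 1][0] != score
        if at_boundary and targets > best[1] and decoy_fdr(targets, decoys) <= max_fdr:
            best = (score, targets)
    return best


# Hypothetical scores, invented purely to exercise the function.
example_psms: List[PSM] = [
    (9.1, False), (8.7, False), (8.2, False), (7.9, True),
    (7.5, False), (7.0, False), (6.4, True), (6.0, False),
]

if __name__ == "__main__":
    threshold, accepted = best_threshold(example_psms, max_fdr=0.25)
    print(f"accept PSMs with score >= {threshold}: {accepted} targets")
```

A single threshold on one score is the one-dimensional analogue of a linear decision boundary. The abstract's argument is that combining several score features, and allowing the boundary to be nonlinear, can admit more correct PSMs at the same estimated error rate.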
