Journal: Journal of Proteome Research

Re-Fraction: A machine learning approach for deterministic identification of protein homologues and splice variants in large-scale MS-based proteomics

Abstract

A key step in the analysis of mass spectrometry (MS)-based proteomics data is the inference of proteins from identified peptide sequences. Here we describe Re-Fraction, a novel machine learning algorithm that enhances deterministic protein identification. Re-Fraction utilizes several protein physical properties to assign proteins to expected protein fractions that comprise large-scale MS-based proteomics data. This information is then used to appropriately assign peptides to specific proteins. This approach is sensitive, highly specific, and computationally efficient. We provide algorithms and source code for the current version of Re-Fraction, which accepts output tables from the MaxQuant environment. Nevertheless, the principles behind Re-Fraction can be applied to other protein identification pipelines where data are generated from samples fractionated at the protein level. We demonstrate the utility of this approach through reanalysis of data from a previously published study and generate lists of proteins deterministically identified by Re-Fraction that were previously only identified as members of a protein group. We find that this approach is particularly useful in resolving protein groups composed of splice variants and homologues, which are frequently expressed in a cell- or tissue-specific manner and may have important biological consequences.
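The core idea in the abstract, using a protein's physical properties to predict which fraction it should appear in and then using that prediction to resolve an ambiguous protein group, can be illustrated with a toy sketch. This is not the Re-Fraction algorithm or its source code. It assumes molecular weight is the only property used, replaces the trained model with a nearest-centre rule on a log scale, and uses hypothetical fraction centres, protein names, and masses. The names `Protein`, `expected_fraction`, and `resolve_group` are invented for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Protein:
    name: str
    mass_kda: float  # molecular weight in kDa, assumed known from the sequence


def expected_fraction(mass_kda, fraction_centres_kda):
    """Index of the fraction whose (hypothetical) centre mass is nearest on a log scale.

    Stand-in for a trained classifier: a real model could use several
    physical properties, not molecular weight alone.
    """
    return min(
        range(len(fraction_centres_kda)),
        key=lambda i: abs(math.log(mass_kda) - math.log(fraction_centres_kda[i])),
    )


def resolve_group(group, observed_fraction, fraction_centres_kda, tolerance=0):
    """Keep group members whose expected fraction is within `tolerance` of the observed one."""
    return [
        p
        for p in group
        if abs(expected_fraction(p.mass_kda, fraction_centres_kda) - observed_fraction)
        <= tolerance
    ]


# Hypothetical fractions, index 0 = heaviest, centred at these masses (kDa).
centres = [200.0, 100.0, 50.0, 25.0, 12.0]

# Hypothetical protein group: two splice variants that share the identified peptides.
group = [Protein("variant_long", 95.0), Protein("variant_short", 27.0)]

# The shared peptides were observed in fraction 3, so only the short variant fits.
resolved = resolve_group(group, observed_fraction=3, fraction_centres_kda=centres)
print([p.name for p in resolved])  # ['variant_short']
```

If exactly one member of a group survives the filter, the peptides can be assigned to it deterministically. If several survive, the group stays unresolved. The `tolerance` parameter is an assumption that allows for proteins migrating into neighbouring fractions.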