Journal: IEEE/ACM Transactions on Computational Biology and Bioinformatics

Protein Inference from the Integration of Tandem MS Data and Interactome Networks


Abstract

In the preprocessing step of tandem mass spectrometry (MS), proteins are digested into a mixture of peptides, so it is difficult to determine which specific protein a shared peptide belongs to. Recent studies exploit additional information, beyond tandem MS data and peptide identification information, to infer proteins. Unlike methods that first infer proteins from tandem MS data alone and then refine the results with network information, this study proposes a protein inference method named TMSIN, which uses interactome networks directly. Because two interacting proteins should co-exist, it is reasonable to assume that if one of them is confidently inferred in a sample, its interacting partners should also have a high probability of being present in the same sample. The neighborhood information of a protein in an interactome network can therefore be used to adjust the probability that a shared peptide belongs to that protein. In TMSIN, a bipartite graph is built from the peptide identification information and combined with interactome network information to construct a multi-weighted graph. Based on this multi-weighted graph, TMSIN adopts an iterative workflow to infer proteins. At each iteration, the probability that a shared peptide belongs to a specific protein is calculated with Bayes' rule from the neighbor protein support scores of each protein to which the shared peptide maps. We carried out experiments on yeast and human data to evaluate TMSIN in terms of ROC, q-value, and accuracy. The AUC scores yielded by TMSIN are 0.742 on the yeast dataset and 0.874 on the human dataset, and TMSIN yields the maximum number of true positives when the q-value is less than or equal to 0.05. The overlap analysis shows that TMSIN is an effective complementary approach for protein inference.
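
The abstract gives no formulas, so the following is only a toy sketch of the general idea, not the TMSIN algorithm itself. It assumes a simple support score (a small floor plus the mean probability of a protein's interacting partners), a uniform prior over the candidate proteins of a shared peptide, and a noisy-OR combination of peptide evidence. The function name `infer_proteins`, the parameter `support_floor`, and the example data are all hypothetical.

```python
from collections import defaultdict


def infer_proteins(peptide_probs, peptide_to_proteins, interactions,
                   n_iter=20, support_floor=0.1):
    """Toy iterative protein inference with interactome support.

    peptide_probs:       {peptide: identification probability}
    peptide_to_proteins: {peptide: list of candidate proteins} (bipartite graph)
    interactions:        iterable of (protein, protein) interactome edges
    All scoring choices here are illustrative assumptions.
    """
    proteins = sorted({pr for ps in peptide_to_proteins.values() for pr in ps})
    neighbors = defaultdict(set)
    for a, b in interactions:
        neighbors[a].add(b)
        neighbors[b].add(a)

    # Start by splitting every peptide evenly among its candidate proteins.
    assign = {pep: {pr: 1.0 / len(ps) for pr in ps}
              for pep, ps in peptide_to_proteins.items()}

    def protein_probabilities():
        # Noisy-OR: a protein is present unless all of its peptides "miss".
        probs = {}
        for pr in proteins:
            miss = 1.0
            for pep, ps in peptide_to_proteins.items():
                if pr in ps:
                    miss *= 1.0 - assign[pep][pr] * peptide_probs[pep]
            probs[pr] = 1.0 - miss
        return probs

    for _ in range(n_iter):
        prot_prob = protein_probabilities()
        # Assumed support score: floor + mean probability of the partners.
        support = {}
        for pr in proteins:
            nb = [prot_prob[q] for q in neighbors[pr] if q in prot_prob]
            support[pr] = support_floor + (sum(nb) / len(nb) if nb else 0.0)
        # Bayes-style update with a uniform prior over candidates:
        # P(protein | peptide) is proportional to the protein's support.
        for pep, ps in peptide_to_proteins.items():
            total = sum(support[pr] for pr in ps)
            assign[pep] = {pr: support[pr] / total for pr in ps}

    return protein_probabilities(), assign


# Hypothetical example: peptide p3 is shared by proteins C and D.
# C interacts with the confidently identified proteins A and B; D does not.
peptide_probs = {"p1": 0.95, "p2": 0.90, "p3": 0.80}
peptide_to_proteins = {"p1": ["A"], "p2": ["B"], "p3": ["C", "D"]}
interactions = [("A", "C"), ("B", "C")]

prot_prob, assign = infer_proteins(peptide_probs, peptide_to_proteins, interactions)
```

In this example the shared peptide `p3` is pulled toward `C`, whose interacting partners are well supported, so `prot_prob["C"]` ends up higher than `prot_prob["D"]`. The floor keeps proteins without well-supported partners from being assigned exactly zero.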
