An Improved Algorithm and Web Application for Predicting Co-Complexed Proteins from Affinity Purification – Mass Spectrometry Data

Abstract

Protein-protein interactions defined by affinity purification and mass spectrometry (APMS) approaches suffer from high false discovery rates. Consequently, the candidate interaction lists must be pruned of contaminants before network construction and interpretation, historically an expensive and time-intensive task. In recent years, numerous computational methods have been developed to identify genuine interactions from hundreds revealed by APMS experiments. Here, comparative analysis of several popular algorithms revealed complementarity in their classification accuracies, which is supported by their divergent scoring strategies. As such, we used two accurate and computationally efficient methods as features for machine learning using the Random Forest algorithm. Additionally, we developed novel mathematical models to include a variety of indirect data, such as mRNA co-expression, gene ontologies and homologous protein interactions as features within the classification problem. We show that our method, which we call Spotlite, outperforms existing methods on four diverse and public APMS datasets. Because implementation of existing APMS scoring methods requires computational expertise beyond many laboratories, we created a user-friendly and fast web application for APMS data scoring, analysis, annotation and network visualization, for use on new and existing data (). The utility of Spotlite and its visualization platform for revealing physical, functional and disease-relevant characteristics within APMS data is established through a focused analysis of the KEAP1 E3 ubiquitin ligase.
