Source: Database: The Journal of Biological Databases and Curation

ASAP: a machine learning framework for local protein properties

Abstract

Determining residue-level protein properties, such as sites of post-translational modifications (PTMs), is vital to understanding protein function. Experimental methods are costly and time-consuming, while traditional rule-based computational methods fail to annotate sites that lack substantial sequence similarity to known examples. Machine learning (ML) methods are becoming fundamental in annotating unknown proteins and their heterogeneous properties. We present ASAP (Amino-acid Sequence Annotation Prediction), a universal ML framework for predicting residue-level properties. ASAP extracts numerous features from raw sequences, and supports easy integration of external features such as secondary structure, solvent accessibility, intrinsic disorder or PSSM profiles. These features are then used to train ML classifiers. ASAP can create new classifiers within minutes for a variety of tasks, including PTM prediction (e.g. cleavage sites by convertase, phosphoserine modification). We present a detailed case study for ASAP: CleavePred, an ASAP-based model that predicts protein precursor cleavage sites with state-of-the-art results. Protein cleavage is a PTM found in a wide variety of proteins that share minimal sequence similarity. Current rule-based methods suffer from high false positive rates, making them suboptimal. The high performance of CleavePred makes it suitable for analyzing new proteomes at a genomic scale. The tool is attractive for protein design, mass spectrometry search engines and the discovery of new bioactive peptides from precursors. ASAP functions as a baseline approach for residue-level protein sequence prediction. CleavePred is freely accessible as a web-based application. Both ASAP and CleavePred are open-source with a flexible Python API.

Database URL: ASAP and CleavePred source code, web tool and tutorials are available at: ; .
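To make the idea of residue-level features concrete, here is a minimal sketch of window-based feature extraction from a raw sequence. It is not ASAP's actual API. The function names, the padding symbol, the window size and the feature naming scheme are all assumptions chosen for illustration, and the abstract does not specify which features ASAP computes.

```python
from typing import Dict, List

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAD = "_"  # hypothetical padding symbol for positions beyond the sequence ends


def window_around(sequence: str, index: int, flank: int) -> str:
    """Return the residues within `flank` positions of `index`, padded at the ends."""
    padded = PAD * flank + sequence + PAD * flank
    # After padding, original position `index` sits at `index + flank`,
    # so the window starts at `index` in the padded string.
    return padded[index:index + 2 * flank + 1]


def residue_features(sequence: str, index: int, flank: int = 3) -> Dict[str, float]:
    """Build an illustrative feature dictionary for one residue (assumed scheme)."""
    window = window_around(sequence, index, flank)
    features: Dict[str, float] = {}
    # Positional one-hot features: which residue sits at each offset.
    for offset, aa in zip(range(-flank, flank + 1), window):
        features[f"pos{offset:+d}={aa}"] = 1.0
    # Composition features: fraction of each amino acid within the window.
    for aa in AMINO_ACIDS:
        features[f"frac_{aa}"] = window.count(aa) / len(window)
    return features


def sequence_features(sequence: str, flank: int = 3) -> List[Dict[str, float]]:
    """One feature dictionary per residue of the sequence."""
    return [residue_features(sequence, i, flank) for i in range(len(sequence))]
```

Each residue yields one feature dictionary. Paired with per-residue labels (for example, cleavage site or not), these dictionaries could be converted to a numeric matrix, for instance with scikit-learn's `DictVectorizer`, and passed to any standard classifier. External per-residue features such as predicted secondary structure would be added as extra keys in the same dictionary.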
