
Logistic regression models to predict solvent accessible residues using sequence- and homology-based qualitative and quantitative descriptors applied to a domain-complete X-ray structure learning set


Abstract

A working example of relative solvent accessibility (RSA) prediction for proteins is presented. Novel logistic regression models with various qualitative descriptors that include amino acid type and quantitative descriptors that include 20- and six-term sequence entropy have been built and validated. A domain-complete learning set of over 1300 proteins is used to fit initial models with various sequence homology descriptors as well as query residue qualitative descriptors. Homology descriptors are derived from BLASTp sequence alignments, whereas the RSA values are determined directly from the crystal structure. The logistic regression models are fitted using dichotomous responses indicating buried or accessible solvent, with binary classifications obtained from the RSA values. The fitted models determine binary predictions of residue solvent accessibility with accuracies comparable to other less computationally intensive methods using the standard RSA threshold criteria 20 and 25% as solvent accessible. When an additional non-homology descriptor describing Lobanov–Galzitskaya residue disorder propensity is included, incremental improvements in accuracy are achieved with 25% threshold accuracies of 76.12 and 74.79% for the Manesh-215 and CASP(8+9) test sets, respectively. Moreover, the described software and the accompanying learning and validation sets allow students and researchers to explore the utility of RSA prediction with simple, physically intuitive models in any number of related applications.
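The pipeline described in the abstract (entropy descriptors, thresholded RSA labels, a logistic fit) can be illustrated with a minimal Python sketch. This is not the authors' software. The six-class amino acid grouping, the `>=` threshold comparison, the gradient-descent fit, and all function names are assumptions made for illustration; the abstract does not specify any of them.

```python
import math
from collections import Counter

# Hypothetical six-class grouping of the 20 amino acids (an assumption;
# the abstract does not say which grouping the six-term entropy uses).
SIX_CLASSES = {
    "aliphatic": "AVLIMC",
    "aromatic": "FWYH",
    "polar": "STNQ",
    "positive": "KR",
    "negative": "DE",
    "special": "GP",
}
RESIDUE_TO_CLASS = {aa: name for name, aas in SIX_CLASSES.items() for aa in aas}


def shannon_entropy(symbols):
    """Shannon entropy (bits) of the symbol frequencies in an iterable."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def column_entropies(column):
    """Return (20-term, six-term) entropy for one alignment column."""
    residues = [aa for aa in column if aa in RESIDUE_TO_CLASS]  # skip gaps
    e20 = shannon_entropy(residues)
    e6 = shannon_entropy(RESIDUE_TO_CLASS[aa] for aa in residues)
    return e20, e6


def is_accessible(rsa_percent, threshold=25.0):
    """Binary label: 1 if RSA (percent) reaches the threshold, else 0.
    Using >= rather than > is an assumption."""
    return 1 if rsa_percent >= threshold else 0


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights (intercept first) by batch gradient descent on log-loss."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
            err = sigmoid(z) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad)]
    return w


def predict(w, x, cutoff=0.5):
    """Predict 1 (accessible) or 0 (buried) for one descriptor vector."""
    z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))
    return 1 if sigmoid(z) >= cutoff else 0


if __name__ == "__main__":
    # Made-up toy data: (20-term entropy, six-term entropy) and RSA in percent.
    X = [(0.0, 0.0), (0.5, 0.2), (1.0, 0.5), (2.5, 1.5), (3.0, 2.0), (3.5, 2.2)]
    rsa = [2.0, 10.0, 18.0, 30.0, 55.0, 70.0]
    y = [is_accessible(r, threshold=25.0) for r in rsa]
    w = fit_logistic(X, y)
    print("weights:", w)
    print("prediction for a conserved column:", predict(w, (0.0, 0.0)))
```

In practice the columns would come from BLASTp alignments and the RSA values from crystal structures, as the abstract states, and further descriptors such as residue type would enter the model as additional columns of `X`.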


Bibliographic details

Journal: Journal of Applied Crystallography
Authors: Reecha Nepal; Joanna Spencer; Guneet Bhogal; Amulya Nedunuri; Thomas Poelman; Thejas Kamath; Edwin Chung; Katherine Kantardjieff; Andrea Gottlieb; Brooke Lustig
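Year (volume), issue: 2015 (48), Pt 6
Pages: 1976–1984
Total pages: 9
Original format: PDF
Chinese Library Classification: biochemical genetics; biochemical pharmacology
Keywords: relative solvent accessibility; logistic regression; Lobanov–Galzitskaya descriptor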


Similar literature


1. Logistic regression models to predict solvent accessible residues using sequence- and homology-based qualitative and quantitative descriptors applied to a domain-complete X-ray structure learning set [J]. Nepal Reecha, Spencer Joanna, Bhogal Guneet. Journal of Applied Crystallography. 2015, No. 6
2. MSACompro: protein multiple sequence alignment using predicted secondary structure, solvent accessibility, and residue-residue contacts [J]. Xin Deng, Jianlin Cheng. BMC Bioinformatics. 2011, No. 1
3. Effect Modeling of Count Data Using Logistic Regression with Qualitative Predictors [J]. Haeil Ahn. Engineering. 2014, No. 12
4. Novel Application of Query-Based Qualitative Predictors for Characterization of Solvent Accessible Residues in Conjunction with Protein Sequence Homology [C]. Daniel A. Rose, Reecha Nepal, Radhika Mishra. International Workshop on Database and Expert Systems Applications. 2011
5. Novel application of query-based qualitative predictors for characterization of solvent-accessible residues in conjunction with protein sequence homology [D]. Nepal, Reecha. 2013
6. MSACompro: protein multiple sequence alignment using predicted secondary structure solvent accessibility and residue-residue contacts [O]. Xin Deng, Jianlin Cheng. 2011
7. Logistic regression models to predict solvent accessible residues using sequence- and homology-based qualitative and quantitative descriptors applied to a domain-complete X-ray structure learning set [O]. Nepal, Reecha; Spencer, Joanna; Bhogal, Guneet. 2015
