Journal: Bioinformatics

PREDICT-2ND: a tool for generalized protein local structure prediction

Abstract

MOTIVATION: Predictions of protein local structure, derived from sequence alignment information alone, provide visualization tools for biologists to evaluate the importance of amino acid residue positions of interest in the absence of X-ray crystal/NMR structures or homology models. They are also useful as inputs to sequence analysis and modeling tools, such as hidden Markov models (HMMs), which can be used to search for homology in databases of known protein structure. In addition, local structure predictions can be used as a component of cost functions in genetic algorithms that predict protein tertiary structure. We have developed a program (predict-2nd) that trains multilayer neural networks and have applied it to numerous local structure alphabets, tuning network parameters such as the number of layers, the number of units in each layer and the window sizes of each layer. We have had the most success with four-layer networks, with gradually increasing window sizes at each layer.

RESULTS: Because the four-layer neural nets occasionally get trapped in poor local optima, our training protocol now uses many different random starts, with short training runs, followed by more training on the best performing networks from the short runs. One recent addition to the program is the option to add a guide sequence to the profile inputs, increasing the number of inputs per position by 20. We find that use of a guide sequence provides a small but consistent improvement in the predictions for several different local-structure alphabets.

AVAILABILITY: Local structure prediction with the methods described here is available for use online at http://www.soe.ucsc.edu/compbio/SAM_T08/T08-query.html. The source code and example networks for PREDICT-2ND are available at http://www.soe.ucsc.edu/~karplus/predict-2nd/. A required C++ library is available at http://www.soe.ucsc.edu/~karplus/ultimate/
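The two ideas in the RESULTS paragraph can be illustrated with a short Python sketch. It is not taken from the PREDICT-2ND source (which is C++), and every name in it (`add_guide_sequence`, `multi_start`, `toy_train`, `toy_score`) is hypothetical. The first function assumes the guide sequence is encoded as a 20-element one-hot vector appended to each profile column, which matches the stated increase of 20 inputs per position but is otherwise an assumption. The second function mimics the multi-start protocol, with a simple hill climber on a two-peaked toy function standing in for neural network training.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def add_guide_sequence(profile, guide):
    """Append a 20-element one-hot encoding of the guide residue to each profile column."""
    if len(profile) != len(guide):
        raise ValueError("profile and guide sequence must have the same length")
    augmented = []
    for column, residue in zip(profile, guide):
        one_hot = [0.0] * len(AMINO_ACIDS)
        index = AMINO_ACIDS.find(residue)
        if index >= 0:  # unknown residues (e.g. 'X') stay all-zero (assumption)
            one_hot[index] = 1.0
        augmented.append(list(column) + one_hot)
    return augmented


def multi_start(train, score, n_starts, short_steps, long_steps, keep, seed=0):
    """Many short runs from random starts, then longer training on the best ones."""
    rng = random.Random(seed)
    candidates = [train(None, short_steps, rng) for _ in range(n_starts)]
    candidates.sort(key=score, reverse=True)
    finalists = [train(model, long_steps, rng) for model in candidates[:keep]]
    return max(finalists, key=score)


def toy_score(x):
    """Toy objective: a poor local optimum near x = -2, a better one near x = 3."""
    return max(1.0 - (x + 2.0) ** 2, 4.0 - (x - 3.0) ** 2)


def toy_train(x, steps, rng):
    """Stand-in for network training: random-start hill climbing on toy_score."""
    if x is None:
        x = rng.uniform(-5.0, 5.0)  # random start
    for _ in range(steps):
        candidate = x + rng.uniform(-0.1, 0.1)
        if toy_score(candidate) > toy_score(x):
            x = candidate
    return x


if __name__ == "__main__":
    profile = [[0.05] * 20, [0.05] * 20]  # two positions, 20 profile inputs each
    print(len(add_guide_sequence(profile, "AC")[0]))  # inputs per position after augmentation
    print(multi_start(toy_train, toy_score, n_starts=10,
                      short_steps=20, long_steps=500, keep=3))
```

A single hill-climbing run that starts left of the crossover point stays stuck on the lower peak; spending a small budget on many starts before committing longer training to the top few is one plausible reading of why the protocol helps with poor local optima.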
