Journal: BMC Bioinformatics

3D deep convolutional neural networks for amino acid environment similarity analysis

Abstract

Background: Central to protein biology is the understanding of how structural elements give rise to observed function. The surfeit of protein structural data enables development of computational methods to systematically derive rules governing structural-functional relationships. However, performance of these methods depends critically on the choice of protein structural representation. Most current methods rely on features that are manually selected based on knowledge about protein structures. These are often general-purpose but not optimized for the specific application of interest. In this paper, we present a general framework that applies 3D convolutional neural network (3DCNN) technology to structure-based protein analysis. The framework automatically extracts task-specific features from the raw atom distribution, driven by supervised labels. As a pilot study, we use our network to analyze local protein microenvironments surrounding the 20 amino acids, and predict the amino acids most compatible with environments within a protein structure. To further validate the power of our method, we construct two amino acid substitution matrices from the prediction statistics and use them to predict effects of mutations in T4 lysozyme structures.

Results: Our deep 3DCNN achieves a two-fold increase in prediction accuracy compared to models that employ conventional hand-engineered features and successfully recapitulates known information about similar and different microenvironments. Models built from our predictions and substitution matrices achieve an 85% accuracy predicting outcomes of the T4 lysozyme mutation variants. Our substitution matrices contain rich information relevant to mutation analysis compared to well-established substitution matrices. Finally, we present a visualization method to inspect the individual contributions of each atom to the classification decisions.

Conclusions: End-to-end trained deep learning networks consistently outperform methods using hand-engineered features, suggesting that the 3DCNN framework is well suited for analysis of protein microenvironments and may be useful for other protein structural analyses.
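
The abstract says the network learns from the "raw atom distribution" around each residue but does not describe the input encoding. The sketch below illustrates one plausible way to turn atom coordinates into a multi-channel 3D grid that a 3D convolutional network could consume. The function name `voxelize`, the four element channels, the 20 Å box, and the 1 Å voxel size are all assumptions made for illustration and are not taken from the abstract.

```python
import numpy as np

# Hypothetical settings chosen for illustration; the abstract does not give them.
CHANNELS = ("C", "N", "O", "S")   # one grid channel per element type
BOX_SIZE = 20.0                   # edge length of the cubic box, in angstroms
VOXEL_SIZE = 1.0                  # edge length of one voxel, in angstroms


def voxelize(coords, elements, center):
    """Count atoms per voxel in a cubic box around `center`.

    Returns an array of shape (channels, n, n, n), one channel per element.
    """
    n = int(round(BOX_SIZE / VOXEL_SIZE))
    grid = np.zeros((len(CHANNELS), n, n, n), dtype=np.float32)

    # Shift coordinates so the lower corner of the box sits at the origin.
    corner = np.asarray(center, dtype=float) - BOX_SIZE / 2.0
    shifted = np.asarray(coords, dtype=float).reshape(-1, 3) - corner
    idx = np.floor(shifted / VOXEL_SIZE).astype(int)

    for (i, j, k), element in zip(idx, elements):
        if element not in CHANNELS:
            continue  # skip element types that have no channel
        if 0 <= i < n and 0 <= j < n and 0 <= k < n:
            grid[CHANNELS.index(element), i, j, k] += 1.0
    return grid


# Toy input: two carbons near the center, one nitrogen far outside the box,
# and one oxygen near the lower corner of the box.
grid = voxelize(
    coords=[[0.0, 0.0, 0.0], [0.5, 0.5, 0.5], [100.0, 0.0, 0.0], [-9.5, -9.5, -9.5]],
    elements=["C", "C", "N", "O"],
    center=[0.0, 0.0, 0.0],
)
```

In a pipeline like the one the abstract outlines, a grid of this kind would be built around each residue position and passed to a 3D convolutional classifier with 20 output classes, one per amino acid. Atoms outside the box are simply dropped in this sketch.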
