Journal: Journal of Computer-Aided Molecular Design

Individually double minimum-distance definition of protein-RNA binding residues and application to structure-based prediction


Abstract

Identifying protein-RNA binding residues is essential for understanding the mechanism of protein-RNA interactions. To date, binding residues have usually been defined with a rigid distance threshold. However, after investigating 182 non-redundant protein-RNA complexes, we find that a single rigid threshold is unsuitable for a number of complexes, because the distances between proteins and RNAs vary widely. In this work, we propose a new definition based on a flexible distance cutoff. The method accounts for individual differences among complexes by setting a variable tolerance limit for protein-RNA interactions, namely the double minimum-distance, which yields a different distance threshold for each complex. To validate the method, we compared the flexible definition with traditional rigid definitions in terms of interface structure, amino acid composition, interface area, interaction forces, and other properties. The results indicate that the flexible definition is more reasonable: it reflects the specificity of each complex, recovering important residues missed by rigid-distance methods and discarding some redundant ones. Finally, to further test the double minimum-distance definition, we developed a classifier that predicts the binding sites derived from it, using structural features and a random forest machine learning algorithm. The model achieved satisfactory prediction performance, with an accuracy of 85.0% on independent data sets. To the best of our knowledge, it is the first prediction model to define positive and negative samples using a flexible cutoff. Together, the comparative analysis and modeling results suggest that the method is a promising strategy for defining protein-RNA binding sites more precisely.
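The contrast between a rigid cutoff and a per-complex flexible cutoff can be illustrated with a minimal Python sketch. The abstract does not give the exact formula, so the sketch assumes that the double minimum-distance cutoff for a complex is twice the smallest protein-RNA atom-atom distance found in that complex, and that a residue counts as binding when its own minimum distance to the RNA falls within that cutoff. The function names, the `factor` parameter, the 3.5 Å rigid cutoff, and the toy coordinates are all hypothetical and are not taken from the paper.

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)


def min_distance(residue_atoms, rna_atoms):
    """Smallest distance between any atom of the residue and any RNA atom."""
    return min(dist(a, b) for a in residue_atoms for b in rna_atoms)


def binding_residues_rigid(residues, rna_atoms, cutoff=3.5):
    """Rigid definition: one fixed cutoff (in angstroms) for every complex.

    The 3.5 default is only an illustrative value.
    """
    return sorted(
        rid for rid, atoms in residues.items()
        if min_distance(atoms, rna_atoms) <= cutoff
    )


def binding_residues_double_min(residues, rna_atoms, factor=2.0):
    """Flexible definition (assumed form): cutoff = factor * complex minimum.

    Returns the per-complex cutoff and the residues within it.
    """
    per_residue = {
        rid: min_distance(atoms, rna_atoms) for rid, atoms in residues.items()
    }
    complex_min = min(per_residue.values())  # closest protein-RNA contact
    cutoff = factor * complex_min            # threshold specific to this complex
    binding = sorted(rid for rid, d in per_residue.items() if d <= cutoff)
    return cutoff, binding


# Toy complex: one RNA atom at the origin and three single-atom residues.
rna_atoms = [(0.0, 0.0, 0.0)]
residues = {
    "A1": [(3.0, 0.0, 0.0)],  # 3 angstroms from the RNA
    "A2": [(5.0, 0.0, 0.0)],  # 5 angstroms from the RNA
    "A3": [(0.0, 7.0, 0.0)],  # 7 angstroms from the RNA
}

rigid = binding_residues_rigid(residues, rna_atoms)
flex_cutoff, flexible = binding_residues_double_min(residues, rna_atoms)
print(rigid)                  # ['A1']
print(flex_cutoff, flexible)  # 6.0 ['A1', 'A2']
```

In this toy case the closest contact is 3 Å, so the assumed flexible cutoff becomes 6 Å and residue A2 is included, whereas the fixed 3.5 Å cutoff keeps only A1. A complex with a tighter closest contact would get a correspondingly smaller cutoff under the same rule.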
