Journal of Computer-Aided Molecular Design

H-DROP: an SVM based helical domain linker predictor trained with features optimized by combining random forest and stepwise selection

Abstract

Domain linker prediction is attracting much interest because it can help identify novel domains suitable for high-throughput proteomics analysis. Here, we report H-DROP, an SVM-based Helical Domain linker pRediction using OPtimal features. To the best of our knowledge, H-DROP is the first predictor that specifically and effectively identifies helical linkers. This was made possible first because a large training dataset became available from IS-Dom, and second because we selected a small number of optimal features from a huge number of potential ones. The training dataset, which included 261 helical linkers, was constructed by detecting helical residues at the boundary regions of two independent structural domains listed in our previously reported IS-Dom dataset. Random forest selected 45 optimal feature candidates from 3,000 features, and stepwise selection further reduced these to 26 optimal features. The prediction sensitivity and precision of H-DROP were 35.2 % and 38.8 %, respectively. These values were over 10.7 % higher than those of control methods, including our previously developed DROP, which is a coil linker predictor, and PPRODO, which is trained with undifferentiated domain boundary sequences. Overall, these results indicate that helical linkers can be predicted from sequence information alone by using a strictly curated training dataset for helical linkers and a carefully selected set of optimal features. H-DROP is available at http://domserv.lab.tuat.ac.jp
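The sensitivity and precision figures quoted in the abstract can be read against the standard definitions: sensitivity = TP / (TP + FN) and precision = TP / (TP + FP). The sketch below is a minimal illustration of those definitions only. It assumes per-residue binary labels (1 = linker residue, 0 = non-linker), which may differ from the counting scheme the authors used; the function name and the toy labels are hypothetical and are not data from the paper.

```python
def sensitivity_and_precision(true_labels, predicted_labels):
    """Return (sensitivity, precision) for binary labels (1 = linker, 0 = non-linker).

    Assumption: labels are compared position by position. The paper may
    count predictions per linker rather than per residue.
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label sequences must have the same length")

    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # linker, predicted linker
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # linker, missed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # non-linker, predicted linker

    # Return 0.0 instead of dividing by zero when a denominator is empty.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return sensitivity, precision


# Toy labels invented for illustration only (not data from the paper).
true_labels = [0, 0, 1, 1, 1, 0, 0, 1, 0, 0]
predicted_labels = [0, 1, 1, 1, 0, 0, 0, 0, 1, 0]

sens, prec = sensitivity_and_precision(true_labels, predicted_labels)
print(f"sensitivity={sens:.2f} precision={prec:.2f}")
```

In this toy case two of the four true linker positions are recovered and two of the four predicted positions are correct, so both values come out at 0.50.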