Journal: Bioinformatics

Redundancy-weighting the PDB for detailed secondary structure prediction using deep-learning models



Abstract

Motivation: The Protein Data Bank (PDB), the ultimate source for data in structural biology, is inherently imbalanced. To alleviate biases, virtually all structural biology studies use nonredundant (NR) subsets of the PDB, which include only a fraction of the available data. An alternative approach, dubbed redundancy-weighting (RW), down-weights redundant entries rather than discarding them. This approach may be particularly helpful for machine-learning (ML) methods that use the PDB as their source for data. Methods for secondary structure prediction (SSP) have greatly improved over the years with recent studies achieving above 70% accuracy for eight-class (DSSP) prediction. As these methods typically incorporate ML techniques, training on RW datasets might improve accuracy, as well as pave the way toward larger and more informative secondary structure classes.
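The contrast between discarding redundant entries and down-weighting them can be illustrated with a small sketch. The abstract does not state how RW weights are computed, so the inverse-cluster-size rule below is an assumption chosen for simplicity, not the authors' scheme. The function names (`redundancy_weights`, `weighted_accuracy`) and the toy cluster labels are likewise hypothetical.

```python
from collections import Counter


def redundancy_weights(cluster_ids):
    """Assign each entry a weight of 1 / (size of its cluster).

    Assumption: entries are pre-grouped into redundancy clusters
    (for example by sequence similarity). Every cluster then carries
    a total weight of 1, however many members it has.
    """
    sizes = Counter(cluster_ids)
    return [1.0 / sizes[c] for c in cluster_ids]


def weighted_accuracy(predicted, actual, weights):
    """Fraction of total weight that falls on correctly predicted entries."""
    correct = sum(w for p, a, w in zip(predicted, actual, weights) if p == a)
    return correct / sum(weights)


# Hypothetical toy data: five entries falling into three clusters.
cluster_ids = ["A", "A", "A", "B", "C"]
weights = redundancy_weights(cluster_ids)

# Hypothetical secondary structure labels (H = helix, E = strand, C = coil).
actual = ["H", "H", "H", "E", "C"]
predicted = ["H", "H", "E", "E", "C"]
score = weighted_accuracy(predicted, actual, weights)
```

A nonredundant subset would keep one entry per cluster and drop the other two members of cluster "A". Under this weighting all five entries are kept, but cluster "A" contributes the same total weight as each singleton cluster.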
