Journal: Journal of Bioinformatics and Computational Biology
Unsupervised multi-instance learning for protein structure determination

Abstract

Many regions of the protein universe remain inaccessible by wet-laboratory or computational structure determination methods. A significant challenge in elucidating these dark regions in silico relates to the ability to discriminate relevant structure(s) among many structures/decoys computed for a protein of interest, a problem known as decoy selection. Clustering decoys based on geometric similarity remains popular. However, it is unclear how exactly to exploit the groups of decoys revealed via clustering to select individual structures for prediction. In this paper, we provide an intuitive formulation of the decoy selection problem as an instance of unsupervised multi-instance learning. We address the problem in three stages, first organizing given decoys of a protein molecule into bags, then identifying relevant bags, and finally drawing individual instances from these bags to offer as prediction. We propose both non-parametric and parametric algorithms for drawing individual instances. Our evaluation utilizes two datasets, one benchmark dataset of ensembles of decoys for a varied list of protein molecules, and a dataset of decoy ensembles for targets drawn from recent CASP competitions. A comparative analysis with state-of-the-art methods reveals that the proposed approach outperforms existing methods, thus warranting further investigation of multi-instance learning to advance our treatment of decoy selection.
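
The three stages named in the abstract (organize decoys into bags, identify relevant bags, draw individual instances) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: decoys are treated as plain feature vectors with Euclidean distance standing in for a structural measure such as RMSD, bags are formed by a simple greedy leader clustering, bag size is used as the relevance proxy, and the non-parametric draw returns the bag medoid. All function names are hypothetical.

```python
import math


def distance(a, b):
    """Euclidean distance between two decoy feature vectors.

    Assumption: a stand-in for a structural dissimilarity such as RMSD.
    """
    return math.dist(a, b)


def organize_into_bags(decoys, radius):
    """Stage 1 (assumed scheme): greedy leader clustering.

    A decoy joins the first bag whose leader lies within `radius`;
    otherwise it starts a new bag. Bags hold decoy indices.
    """
    leaders, bags = [], []
    for i, decoy in enumerate(decoys):
        for k, leader in enumerate(leaders):
            if distance(decoy, decoys[leader]) <= radius:
                bags[k].append(i)
                break
        else:
            leaders.append(i)
            bags.append([i])
    return bags


def select_bags(bags, top_k):
    """Stage 2 (assumed criterion): keep the top_k largest bags."""
    return sorted(bags, key=len, reverse=True)[:top_k]


def draw_instance(bag, decoys):
    """Stage 3 (assumed non-parametric draw): return the bag medoid,
    the member with the smallest total distance to the other members."""
    return min(
        bag,
        key=lambda i: sum(distance(decoys[i], decoys[j]) for j in bag),
    )


def select_decoys(decoys, radius=1.0, top_k=1):
    """Run the three stages and return one decoy index per selected bag."""
    bags = organize_into_bags(decoys, radius)
    return [draw_instance(bag, decoys) for bag in select_bags(bags, top_k)]


if __name__ == "__main__":
    # Toy data: three nearby decoys and two distant ones.
    toy_decoys = [(0.0, 0.0), (0.2, 0.0), (0.1, 0.1), (5.0, 5.0), (5.1, 5.0)]
    print(select_decoys(toy_decoys, radius=1.0, top_k=1))
```

On the toy data the three nearby decoys form the largest bag, and the draw returns the index of the member closest to the others. A parametric variant, which the abstract also mentions, would replace `draw_instance` with a draw from a model fitted to the bag; its form is not specified in the abstract and is not sketched here.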
