Unsupervised multi-instance learning for protein structure determination

Alam Fardina Fathmiul; Shehu Amarda


Abstract

Many regions of the protein universe remain inaccessible to wet-laboratory or computational structure determination methods. A significant challenge in elucidating these dark regions in silico is discriminating the relevant structure(s) among the many structures, or decoys, computed for a protein of interest, a problem known as decoy selection. Clustering decoys based on geometric similarity remains popular. However, it is unclear how exactly to exploit the groups of decoys revealed by clustering to select individual structures for prediction. In this paper, we provide an intuitive formulation of the decoy selection problem as an instance of unsupervised multi-instance learning. We address the problem in three stages: first organizing the given decoys of a protein molecule into bags, then identifying relevant bags, and finally drawing individual instances from these bags to offer as predictions. We propose both non-parametric and parametric algorithms for drawing individual instances. Our evaluation uses two datasets: a benchmark dataset of decoy ensembles for a varied list of protein molecules, and a dataset of decoy ensembles for targets drawn from recent CASP competitions. A comparative analysis with state-of-the-art methods reveals that the proposed approach outperforms existing methods, thus warranting further investigation of multi-instance learning to advance our treatment of decoy selection.
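
The three stages named in the abstract (organize decoys into bags, identify relevant bags, draw instances from them) can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not specify how bags are formed, how relevance is judged, or how instances are drawn. The sketch assumes greedy leader clustering for bag construction, bag size as the relevance criterion, and the medoid as a simple non-parametric draw. Decoys are represented as plain coordinate tuples compared by Euclidean distance, where a real pipeline would more likely compare structures by RMSD after superposition. All function names and the `radius` and `top_k` parameters are hypothetical.

```python
import math
from typing import List, Sequence

# Hypothetical stand-in for a decoy: a flat tuple of coordinates.
Decoy = Sequence[float]


def distance(a: Decoy, b: Decoy) -> float:
    """Euclidean distance (assumption; RMSD would be the usual choice)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def organize_into_bags(decoys: List[Decoy], radius: float) -> List[List[int]]:
    """Stage 1 (assumed): greedy leader clustering.

    Each bag is a list of decoy indices; a decoy joins the first bag whose
    leader (first member) lies within `radius`, else it starts a new bag.
    """
    bags: List[List[int]] = []
    for i, decoy in enumerate(decoys):
        for bag in bags:
            if distance(decoy, decoys[bag[0]]) <= radius:
                bag.append(i)
                break
        else:
            bags.append([i])
    return bags


def select_bags(bags: List[List[int]], top_k: int) -> List[List[int]]:
    """Stage 2 (assumed): treat the largest bags as the relevant ones."""
    return sorted(bags, key=len, reverse=True)[:top_k]


def draw_instance(bag: List[int], decoys: List[Decoy]) -> int:
    """Stage 3 (assumed): return the medoid, i.e. the member with the
    smallest summed distance to all other members of its bag."""
    return min(bag, key=lambda i: sum(distance(decoys[i], decoys[j]) for j in bag))


def predict(decoys: List[Decoy], radius: float, top_k: int = 1) -> List[int]:
    """Run the three stages and return one decoy index per selected bag."""
    bags = organize_into_bags(decoys, radius)
    return [draw_instance(bag, decoys) for bag in select_bags(bags, top_k)]
```

For example, with the toy input `[(0.0, 0.0), (0.2, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]` and `radius=1.0`, the first three points form the larger bag and `predict` returns `[2]`, the index of the point lying between the other two. Swapping in a different clustering method, relevance score, or a parametric draw only requires replacing the corresponding stage function.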


Bibliographic details

Source
Journal of Bioinformatics and Computational Biology | 2021, Issue 1 | 20 pages
Authors
Alam Fardina Fathmiul; Shehu Amarda
Author affiliations

Department of Computer Science, George Mason University, Fairfax, VA 22030, USA (both authors)

Indexing information
Original format: PDF
Language: English
Chinese Library Classification: Cell biology
Keywords
Protein structure determination; decoy selection; decoy quality; multi-instance learning; unsupervised learning

