Journal: Parallel Computing

Enabling scalable and accurate clustering of distributed ligand geometries on supercomputers

Abstract

We present an efficient and accurate clustering method for analyzing protein-ligand docking datasets on large distributed-memory systems. For each ligand conformation in the dataset, our clustering algorithm first extracts relevant geometrical properties and transforms them into a single metadata point in N-dimensional (N-D) space. It then performs N-D clustering on the metadata to search for predominant clusters. Because it extracts relevant data properties locally and concurrently, our method avoids the need to move ligand conformations among nodes. In doing so, we transform the analysis problem (e.g., clustering or classification) into a search for property aggregates. Our analysis shows that on small computer systems of up to 64 nodes, performance is not sensitive to data content and distribution. On larger computer systems of up to 256 nodes, the scalability of simulations with strong convergence toward specific geometries is less sensitive to overheads due to the shuffling of metadata information. We also demonstrate that our method of metadata extraction captures the geometrical properties of ligand conformations more effectively, and clusters and predicts near-native ligand conformations more accurately, than traditional methods, including hierarchical clustering and energy-based scoring. © 2017 Elsevier B.V. All rights reserved.
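
The two-step idea in the abstract (reduce each conformation to one N-D metadata point locally, then search the metadata for dense aggregates) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not say which geometrical properties are extracted or which N-D clustering is used, so the sketch assumes the atom-coordinate centroid as the metadata point (N = 3) and a uniform-grid density count as a stand-in for the clustering step. The function names `extract_metadata`, `local_cell_counts`, and `predominant_cell` are hypothetical, and each "node" is simulated as a plain Python list.

```python
from collections import Counter

import numpy as np


def extract_metadata(conformation):
    """Reduce one conformation (n_atoms x 3 coordinates) to a single N-D point.

    Assumption: the centroid (N = 3) stands in for the geometrical
    properties, which the abstract does not specify.
    """
    return np.asarray(conformation, dtype=float).mean(axis=0)


def local_cell_counts(conformations, cell_size):
    """Per-node work: map local conformations to metadata points and
    count how many points fall in each grid cell.

    Only this small summary leaves the node, never the conformations.
    """
    counts = Counter()
    for conformation in conformations:
        point = extract_metadata(conformation)
        cell = tuple(int(i) for i in np.floor(point / cell_size))
        counts[cell] += 1
    return counts


def predominant_cell(per_node_counts):
    """Merge the per-node summaries and return (cell, count) for the densest cell."""
    total = Counter()
    for counts in per_node_counts:
        total.update(counts)
    return total.most_common(1)[0]


# Two simulated nodes, each holding its own conformations.
node_a = [
    [[0.5, 0.5, 0.5], [1.5, 1.5, 1.5]],
    [[1.0, 1.0, 1.0], [1.2, 1.2, 1.2]],
]
node_b = [
    [[1.2, 1.1, 1.3]],
    [[10.0, 10.0, 10.0]],
]

summaries = [local_cell_counts(node, cell_size=2.0) for node in (node_a, node_b)]
print(predominant_cell(summaries))  # ((0, 0, 0), 3)
```

In this sketch only the per-cell counts are exchanged between nodes, which mirrors the abstract's point that ligand conformations themselves do not need to move. On a real distributed-memory system, the merge step would be a collective reduction rather than a Python loop.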