BMC Bioinformatics

Implementation of 3D spatial indexing and compression in a large-scale molecular dynamics simulation database for rapid atomic contact detection


Abstract

Background: Molecular dynamics (MD) simulations offer the ability to observe the dynamics and interactions of both whole macromolecules and individual atoms as a function of time. Taken in context with experimental data, atomic interactions from simulation provide insight into the mechanics of protein folding, dynamics, and function. The calculation of atomic interactions, or contacts, from an MD trajectory is computationally demanding, and the work required grows exponentially with the size of the simulation system. We describe the implementation of a spatial indexing algorithm in our multi-terabyte MD simulation database that significantly reduces the run-time required for discovery of contacts. The approach is applied to the Dynameomics project data. Spatial indexing, also known as spatial hashing, is a method that divides the simulation space into regularly sized bins and assigns an index to each bin. Because the calculation of contacts is widely employed in the simulation field, we also use it as the basis for testing compression of data tables. We investigate the effects of compressing the trajectory coordinate tables with different options of data and index compression within MS SQL Server 2008.

Results: Our implementation of spatial indexing reduces the run-time of the contact calculation over a 1 nanosecond (ns) simulation window by between 14% and 90% (i.e., between 1.2 and 10.3 times faster). For a 'full' simulation trajectory (51 ns), spatial indexing reduces the calculation run-time by between 31% and 81% (between 1.4 and 5.3 times faster). Compression reduced table sizes but made no significant difference to the total execution time for neighbor discovery. The greatest compression (~36%) was achieved using page-level compression on both the data and the indexes.

Conclusions: The spatial indexing scheme significantly decreases the time taken to calculate atomic contacts and could be applied to other multidimensional neighbor discovery problems. The speed-up enables on-the-fly calculation and visualization of contacts and rapid cross-simulation analysis for knowledge discovery. Using page compression for the atomic coordinate tables and indexes saves ~36% of disk space without any significant change in calculation time and should be considered for other non-transactional databases in MS SQL Server 2008.
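
The binning idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' SQL Server implementation: the function names, the choice of a bin edge equal to the contact cutoff, the sample coordinates, and the 0.54 cutoff are all assumptions made for illustration, and periodic boundaries are ignored. Each atom is assigned to a cubic bin, and contact candidates are then drawn only from an atom's own bin and its 26 neighbors rather than from every pair of atoms.

```python
from collections import defaultdict
from itertools import product
import math


def bin_index(point, edge):
    """Map a 3D point to the integer index of the cubic bin containing it."""
    return tuple(math.floor(c / edge) for c in point)


def find_contacts(coords, cutoff):
    """Return the set of atom index pairs (i, j), i < j, within `cutoff`.

    Assumption: the bin edge equals the cutoff, so every contact partner of
    an atom lies in the atom's own bin or in one of the 26 adjacent bins.
    """
    bins = defaultdict(list)
    for i, point in enumerate(coords):
        bins[bin_index(point, cutoff)].append(i)

    cutoff_sq = cutoff * cutoff
    offsets = list(product((-1, 0, 1), repeat=3))
    contacts = set()
    for (bx, by, bz), members in bins.items():
        for dx, dy, dz in offsets:
            # .get() avoids creating empty bins while iterating over the dict
            neighbours = bins.get((bx + dx, by + dy, bz + dz))
            if not neighbours:
                continue
            for i in members:
                for j in neighbours:
                    if i < j and math.dist(coords[i], coords[j]) ** 2 <= cutoff_sq:
                        contacts.add((i, j))
    return contacts


def find_contacts_all_pairs(coords, cutoff):
    """Reference implementation that tests every pair of atoms."""
    n = len(coords)
    return {
        (i, j)
        for i in range(n)
        for j in range(i + 1, n)
        if math.dist(coords[i], coords[j]) <= cutoff
    }


# Hypothetical coordinates (arbitrary length units) and cutoff
atoms = [(0.0, 0.0, 0.0), (0.3, 0.0, 0.0), (1.0, 1.0, 1.0),
         (1.2, 1.0, 1.0), (5.0, 5.0, 5.0)]
print(sorted(find_contacts(atoms, 0.54)))  # → [(0, 1), (2, 3)]
```

The all-pairs function is included only as a correctness check for the binned version. In a database setting the bin index would presumably be stored as an indexed column so that neighboring bins can be joined directly, but that detail is not given in the abstract.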
