Journal: The Journal of Chemical Physics

Multiresolution persistent homology for excessively large biomolecular datasets



Abstract

Although persistent homology has emerged as a promising tool for the topological simplification of complex data, it is computationally intractable for large datasets. We introduce multiresolution persistent homology to handle excessively large datasets. We match the resolution to the scale of interest, so that a large dataset is represented at an appropriate resolution. We use the flexibility-rigidity index to assess the topological connectivity of the dataset and define a rigidity density for the filtration analysis. By appropriately tuning the resolution of the rigidity density, we can focus the topological lens on the scale of interest. The proposed multiresolution topological analysis is validated on a hexagonal fractal image that has three distinct scales. We further demonstrate the proposed method by extracting topological fingerprints from DNA molecules. In particular, we successfully analyze the topological persistence of a virus capsid with 273 780 atoms, which would otherwise be inaccessible to the normal point cloud method and unreliable with coarse-grained multiscale persistent homology. The proposed method has also been successfully applied to protein domain classification, which is, to our knowledge, the first use of persistent homology for practical protein domain analysis. The proposed multiresolution topological method has potential applications to arbitrary datasets, such as social networks, biological networks, and graphs. © 2015 AIP Publishing LLC.
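The idea of tuning the resolution of a rigidity density can be illustrated with a small toy sketch. The abstract does not give the formula, so everything here is an assumption made for illustration: the density is taken to be a sum of Gaussian kernels centered on the points, the resolution parameter is called `eta`, the data are four hypothetical points on a line, and counting local maxima is used as a simple stand-in for a full persistent homology computation. It is not the authors' implementation.

```python
import numpy as np


def rigidity_density(grid, atoms, eta):
    """Sum of Gaussian kernels centered on each point (assumed kernel form).

    `eta` plays the role of the resolution: a small value resolves
    individual points, a large value blurs them together.
    """
    diff = grid[:, None] - atoms[None, :]          # pairwise offsets, shape (n_grid, n_atoms)
    return np.exp(-(diff / eta) ** 2).sum(axis=1)  # density value at each grid node


def count_peaks(mu):
    """Count interior local maxima of a sampled 1D density.

    Each local maximum starts one connected component when the density
    is filtered from high values to low values.
    """
    mid = mu[1:-1]
    # strictly above the left neighbour, not below the right neighbour
    return int(np.sum((mid > mu[:-2]) & (mid >= mu[2:])))


# Hypothetical toy data: two pairs of points, i.e. two length scales.
atoms = np.array([0.0, 1.0, 10.0, 11.0])
grid = np.linspace(-5.0, 16.0, 2101)

# Fine, intermediate, and coarse resolution.
peaks = {eta: count_peaks(rigidity_density(grid, atoms, eta))
         for eta in (0.3, 2.0, 20.0)}

for eta, n in peaks.items():
    print(f"eta = {eta}: {n} peak(s)")
```

In this sketch the number of density peaks drops as `eta` grows: the fine setting separates individual points, the intermediate one sees only the two pairs, and the coarse one sees a single blob. In a real analysis the density would be sampled on a 3D grid and passed to a persistent homology library rather than to a peak counter.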
