Journal: ChemMedChem

Nonlinear Dimensionality Reduction for Visualizing Toxicity Data: Distance-Based Versus Topology-Based Approaches


Abstract

Over the years, a number of dimensionality reduction techniques have been proposed and used in chemoinformatics to perform nonlinear mappings. In this study, four representative nonlinear dimensionality reduction methods from two families were analyzed: distance-based approaches (Isomap and Diffusion Maps) and topology-based approaches (Generative Topographic Mapping (GTM) and Laplacian Eigenmaps). The methods were applied to the visualization of three toxicity datasets using four sets of descriptors. Two methods, GTM and Diffusion Maps, were identified as the best approaches; because they belong to different families, neither family could be prioritized over the other. The intrinsic dimensionality of the data was assessed by Maximum Likelihood Estimation. Descriptor sets with a higher intrinsic dimensionality were observed to yield maps of lower quality. A new statistical coefficient, which combines two previously known ones, was proposed to rank the maps automatically. Instead of relying on a single best method, we propose to automatically generate maps with different parameter values for different descriptor sets. With this procedure, the maps with the highest values of the introduced coefficient can be selected automatically and used as a starting point for visual inspection by the user.
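The abstract does not say which maximum likelihood estimator of intrinsic dimensionality was used. As an illustration only, the sketch below assumes the commonly used Levina-Bickel nearest-neighbor estimator. The function names `mle_intrinsic_dim` and `sample_plane`, the neighborhood size `k`, and the synthetic test data are all hypothetical choices made for this example and do not come from the paper.

```python
import math
import random


def mle_intrinsic_dim(points, k=10):
    """Levina-Bickel style MLE of intrinsic dimensionality (assumed variant).

    For each point, the local estimate is
        (k - 1) / sum_{j=1}^{k-1} log(T_k / T_j),
    where T_j is the distance to the j-th nearest neighbor.
    The global estimate is the mean of the local estimates.
    """
    n = len(points)
    if not 2 <= k < n:
        raise ValueError("k must satisfy 2 <= k < number of points")
    local_estimates = []
    for i, p in enumerate(points):
        # Distances from p to every other point, in ascending order.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        nearest = dists[:k]
        if nearest[0] == 0.0:
            continue  # duplicate point: log ratio undefined, skip it
        log_sum = sum(math.log(nearest[-1] / t) for t in nearest[:-1])
        if log_sum > 0.0:
            local_estimates.append((k - 1) / log_sum)
    if not local_estimates:
        raise ValueError("no usable neighborhoods (too many duplicate points)")
    return sum(local_estimates) / len(local_estimates)


def sample_plane(n, seed=0):
    """Hypothetical test data: a 2-D plane linearly embedded in 5-D space."""
    rng = random.Random(seed)
    pts = []
    for _ in range(n):
        u, v = rng.random(), rng.random()
        pts.append((u, v, u + v, u - v, 0.5 * u))
    return pts


if __name__ == "__main__":
    data = sample_plane(300)
    # Five coordinates per point, but only two degrees of freedom.
    print(f"estimated intrinsic dimensionality: {mle_intrinsic_dim(data, k=20):.2f}")
```

On data like this, the estimate should land near 2 rather than 5, which is the sense in which a descriptor set can have a lower intrinsic dimensionality than its number of descriptors. The result depends on `k` and on sample size, so for real descriptor sets it is better treated as a rough comparison between sets than as an exact value.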
