Conference: International Conference on Bioinformatics and Computational Biology

An Evaluation of Different Clustering Methods and Distance Measures Used for Grouping Metabolic Pathways

Abstract

Large-scale annotated metabolic databases, such as KEGG and MetaCyc, provide a wealth of information to researchers designing novel biosynthetic pathways. However, many metabolic pathfinding tools that help identify possible solution pathways do not support grouping and interpreting the pathway results they return. Clustering possible solution pathways can help users of pathfinding tools quickly identify major patterns and unique pathways without having to sift through individual results one by one. In this paper, we assess the ability of three clustering methods (hierarchical, k-means, and k-medoids), combined with three pairwise distance measures (Levenshtein, Jaccard, and n-gram), to group lysine, isoleucine, and 3-hydroxypropanoic acid (3-HP) biosynthesis pathways. The quality of the resulting clusters was quantitatively evaluated against expected pathway groupings taken from the literature. Hierarchical clustering with Levenshtein distance appeared to match the external pathway labels best across the three sets of biosynthesis pathways. The lysine biosynthesis pathways, which had the most distinct separation of pathways, yielded better-quality clusters than isoleucine and 3-HP, suggesting that grouping pathways with more complex underlying topologies may require more tailored clustering methods.
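
As a rough illustration of the approach the abstract describes, the sketch below computes the three pairwise distances on pathways and groups them with a simple agglomerative (hierarchical) procedure. Everything in it is an assumption made for illustration and is not taken from the paper: pathways are represented as ordered lists of placeholder compound names, the n-gram distance is taken to be the Jaccard distance over sets of n-grams, average linkage is used, and the toy pathways are invented.

```python
from itertools import combinations

def levenshtein(a, b):
    """Edit distance between two sequences (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,               # delete x
                            curr[j - 1] + 1,           # insert y
                            prev[j - 1] + (x != y)))   # substitute x -> y
        prev = curr
    return prev[-1]

def jaccard(a, b):
    """1 - |A & B| / |A | B| over the sets of elements; ignores order."""
    sa, sb = set(a), set(b)
    union = sa | sb
    return 1 - len(sa & sb) / len(union) if union else 0.0

def ngram(a, b, n=2):
    """Assumed definition: Jaccard distance over the sets of n-grams."""
    def grams(s):
        return {tuple(s[i:i + n]) for i in range(len(s) - n + 1)}
    return jaccard(grams(a), grams(b))

def hierarchical(items, dist, k):
    """Average-linkage agglomerative clustering down to k clusters.

    Returns clusters as lists of indices into `items`.
    """
    idx = range(len(items))
    d = {(i, j): dist(items[i], items[j]) for i, j in combinations(idx, 2)}

    def linkage(c1, c2):
        total = sum(d[min(i, j), max(i, j)] for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    clusters = [[i] for i in idx]
    while len(clusters) > k:
        # Merge the closest pair of clusters (a < b, so deleting b is safe).
        a, b = min(combinations(range(len(clusters)), 2),
                   key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        clusters[a] += clusters[b]
        del clusters[b]
    return clusters

# Invented toy pathways: ordered lists of placeholder compound names.
pathways = [
    ["A", "B", "C", "D"],
    ["A", "B", "X", "D"],
    ["A", "E", "F", "G", "D"],
    ["A", "E", "F", "H", "D"],
]

groups = hierarchical(pathways, levenshtein, k=2)
print(groups)
```

Passing `jaccard` or `ngram` in place of `levenshtein` changes only the distance measure, which is the kind of method-by-measure comparison the abstract reports. Evaluating the resulting groups against literature-derived labels would need an external cluster-validity index, which this sketch does not include.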
