Journal of Molecular Biology

MRMD3.0: A Python Tool and Webserver for Dimensionality Reduction and Data Visualization via an Ensemble Strategy



© 2023 Elsevier Ltd

Dimensionality reduction is an active topic in machine learning that helps researchers find key features in complex medical or biological data, which is crucial for biological sequence research, drug development, and related fields. However, different dimensionality reduction methods produce different results on the same dataset, which makes the outcome unstable and turns parameter tuning into a time-consuming task. Identifying high-quality features, genes, or attributes in complex data therefore remains an important challenge. To ensure the efficiency, robustness, and accuracy of experiments, we developed MRMD3.0, a dimensionality reduction tool based on an ensemble strategy built on link analysis. It works in two steps: first, an ensemble method integrates several feature ranking algorithms to calculate feature importance; then a forward feature search strategy combined with cross-validation explores a suitable feature combination. Compared with the previous version, MRMD3.0 adds more link-based ensemble algorithms, including PageRank, HITS, LeaderRank, and TrustRank. More feature ranking algorithms have also been added, and their effectiveness and computation speed have been greatly improved. In addition, the newest version provides an interface to each feature ranking method and five kinds of charts to help users analyze features. Finally, we also provide an online webserver to help researchers analyze their data.

Availability and implementation: Webserver: http://lab.malab.cn/soft/MRMDv3/home.html. GitHub: https://github.com/heshida01/MRMD3.0.
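The two-step procedure described in the abstract can be illustrated with a small, dependency-free sketch. It is not the MRMD3.0 implementation. The graph construction (an edge from each feature to the one ranked immediately above it), the function names `build_graph`, `pagerank`, and `forward_search`, the toy rankings, and the placeholder `toy_cv_score` are all assumptions made for illustration. In practice the scoring function would be the cross-validated performance of a classifier trained on the candidate subset.

```python
from collections import defaultdict

def build_graph(rankings):
    """Assumed construction: each feature links to the feature ranked just above it."""
    edges = defaultdict(set)
    for ranking in rankings:
        for better, worse in zip(ranking, ranking[1:]):
            edges[worse].add(better)
    return edges

def pagerank(nodes, edges, damping=0.85, iters=100):
    """Plain power-iteration PageRank; dangling nodes spread their score evenly."""
    n = len(nodes)
    score = {v: 1.0 / n for v in nodes}
    for _ in range(iters):
        new = {v: (1.0 - damping) / n for v in nodes}
        for v in nodes:
            targets = edges.get(v) or nodes
            share = damping * score[v] / len(targets)
            for t in targets:
                new[t] += share
        score = new
    return score

def forward_search(ordered, cv_score):
    """Add features in ranked order and keep the best-scoring prefix."""
    best_subset, best = [], float("-inf")
    for k in range(1, len(ordered) + 1):
        s = cv_score(ordered[:k])
        if s > best:
            best_subset, best = ordered[:k], s
    return best_subset, best

# Toy input: three hypothetical feature rankers, best feature first.
rankings = [
    ["f1", "f2", "f3", "f4"],
    ["f2", "f1", "f3", "f4"],
    ["f1", "f3", "f2", "f4"],
]
nodes = sorted({f for r in rankings for f in r})
score = pagerank(nodes, build_graph(rankings))
ordered = sorted(nodes, key=lambda v: score[v], reverse=True)

# Placeholder for cross-validated accuracy: rewards f1 and f2, penalizes size.
def toy_cv_score(subset):
    return len(set(subset) & {"f1", "f2"}) - 0.1 * len(subset)

subset, best = forward_search(ordered, toy_cv_score)
print(ordered, subset)
```

Evaluating only prefixes of the ensemble ranking keeps the search linear in the number of features, which is one plausible reading of "forward feature search"; a greedy search over all remaining features at each step would be a costlier alternative.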



