Distributed heterogeneous ensemble learning on Apache Spark for ligand-based virtual screening

Sid Karima; Batouche Mohamed

Abstract

Virtual screening is one of the most common computer-aided drug design techniques; it applies computational tools and methods to large libraries of molecules to identify drug candidates. Ensemble learning is a recent paradigm for improving the predictive performance and robustness of machine learning models, and it has been applied successfully in ligand-based virtual screening (LBVS). Applying ensemble learning to huge molecular libraries is computationally expensive, so distributing and parallelising the task with frameworks such as Apache Spark has become an important step. In this paper, we propose HEnsL_DLBVS, a new approach for heterogeneous ensemble learning distributed on Spark, to improve large-scale LBVS results. To handle the problem of imbalanced big training datasets, we propose a novel hybrid technique. We generate new training datasets to evaluate the approach. Experimental results confirm the effectiveness of our approach, with satisfactory accuracy, and its superiority over homogeneous models.

Bibliographic details

Source
International Journal of Data Mining, Modelling and Management, 2021, issue 2, pp. 160-191 (32 pages)
Authors
Sid Karima; Batouche Mohamed
Affiliations

Constantine 2 Univ Abdelhamid Mehri Dept Comp Sci Constantine Algeria;

Princess Nourah Univ Dept Informat Technol CCIS RC Riyadh Saudi Arabia;

Original format: PDF
Language: English
Keywords
virtual screening; big data; computer-aided drug design; CADD; Apache Spark; machine learning; drug discovery; ensemble learning; imbalanced datasets; Spark MLlib; ligand-based virtual screening; LBVS
