Journal of Cheminformatics

AZOrange - High performance open source machine learning for QSAR modeling in a graphical programming environment



Abstract

Background: Machine learning has a vast range of applications. In particular, advanced machine learning methods are routinely and increasingly used in quantitative structure-activity relationship (QSAR) modeling. QSAR data sets often encompass tens of thousands of compounds, and both proprietary and public data sets are growing rapidly. Hence, there is a demand for computationally efficient machine learning algorithms that are easily available to researchers without extensive machine learning knowledge. Because they uphold the scientific principles of transparency and reproducibility, Open Source solutions are increasingly acknowledged by regulatory authorities. Thus, an Open Source, state-of-the-art, high performance machine learning platform that interfaces multiple customized machine learning algorithms for both graphical programming and scripting, and that can be used for large-scale development of QSAR models of regulatory quality, is of great value to the QSAR community.

Results: This paper describes the implementation of the Open Source machine learning package AZOrange. AZOrange was developed specifically to support batch generation of QSAR models by providing the full workflow of QSAR modeling, from descriptor calculation to automated model building, validation and selection. The automated workflow relies on customization of the machine learning algorithms and on a generalized, automated process for selecting model hyper-parameters. Several high performance machine learning algorithms are interfaced so that the statistical method can be selected efficiently for each data set, which promotes model accuracy. Using the high performance machine learning algorithms of AZOrange does not require programming knowledge, because flexible applications can be created not only at a scripting level but also in a graphical programming environment.

Conclusions: AZOrange is a step towards meeting the need for an Open Source high performance machine learning platform that supports the efficient development of highly accurate QSAR models fulfilling regulatory requirements.
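The abstract does not say how the automated hyper-parameter selection is implemented, so the following is only a minimal sketch of the general idea, not AZOrange's API or method. It assumes a toy one-descriptor data set, a k-nearest-neighbour regressor swapped in as the learner, and a plain grid search scored by cross-validated error; every name in it (`knn_predict`, `cv_rmse`, `select_k`) is hypothetical.

```python
import random
from statistics import mean


def knn_predict(train, x, k):
    """Predict y for x as the mean target of the k nearest training rows."""
    nearest = sorted(train, key=lambda row: abs(row[0] - x))[:k]
    return mean(y for _, y in nearest)


def cv_rmse(data, k, folds=5):
    """Estimate the RMSE of k-NN by `folds`-fold cross-validation."""
    squared_errors = []
    for f in range(folds):
        # Every `folds`-th row (offset f) is held out; the rest is training data.
        test = data[f::folds]
        train = [row for i, row in enumerate(data) if i % folds != f]
        squared_errors.extend((knn_predict(train, x, k) - y) ** 2 for x, y in test)
    return mean(squared_errors) ** 0.5


def select_k(data, candidates):
    """Return (best_k, scores): the candidate with the lowest CV error wins."""
    scores = {k: cv_rmse(data, k) for k in candidates}
    return min(scores, key=scores.get), scores


# Synthetic (descriptor, activity) pairs standing in for a QSAR data set.
rng = random.Random(0)
data = [(x / 10, (x / 10) ** 2 + rng.gauss(0, 0.5)) for x in range(100)]
rng.shuffle(data)

candidates = [1, 3, 5, 9, 15, 25]
best_k, scores = select_k(data, candidates)
print("selected k:", best_k, "cross-validated RMSE:", round(scores[best_k], 3))
```

The same loop could in principle be extended to compare several learners on one data set and keep the one with the lowest cross-validated error, which is the kind of data set-specific method selection the abstract mentions.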
