Source: ACS Omega (U.S. National Institutes of Health literature collection)

Prediction of Compound Profiling Matrices, Part II: Relative Performance of Multitask Deep Learning and Random Forest Classification on the Basis of Varying Amounts of Training Data

Abstract

Currently, there is a high level of interest in deep learning and multitask learning in many scientific fields including the life sciences and chemistry. Herein, we investigate the performance of multitask deep neural networks (MT-DNNs) compared to random forest (RF) classification, a standard method in machine learning, in predicting compound profiling experiments. Predictions were carried out on a large profiling matrix extracted from biological screening data. For model building, submatrices with varying data density of 5–100% were generated to investigate the influence of data sparseness on prediction performance. MT-DNN models were directly compared to RF models, and control calculations were also carried out using single-task DNNs (ST-DNNs). On the basis of compound recall, the performance of ST-DNN was consistently lower than that of the other methods. Compared to RF, MT-DNN models only yielded better prediction performance for individual assays in the profiling matrix when training data were very sparse. However, when the matrix density increased to at least 25–45%, per-assay RF models met or partly exceeded the prediction performance of MT-DNN models. When the average performances of RF and MT-DNN over the grid of all targets were compared, MT-DNN was slightly superior to RF, which was a likely consequence of multitask learning. Overall, there was no consistent advantage of MT-DNN over standard RF classification in predicting the results of compound profiling assays under varying conditions. In the presence of very sparse training data, prediction performance was limited. Under these challenging conditions, MT-DNN was the preferred approach. When more training data became available and prediction performance increased, RF performance was not inferior to MT-DNN.
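The abstract mentions two operations that are easy to misread: thinning a profiling matrix to a target data density, and scoring predictions by compound recall per assay. The following is a minimal sketch of both, not the authors' procedure. It assumes a compounds-by-assays matrix of 1 (active) and 0 (inactive) labels, assumes that density is reduced by keeping a uniformly random fraction of cells, and uses hypothetical function names (`subsample_to_density`, `recall`, `per_assay_recall`) chosen only for illustration.

```python
import random


def subsample_to_density(matrix, density, seed=0):
    """Keep a `density` fraction of cells; hidden cells become None.

    Assumption: cells are hidden uniformly at random over the whole
    matrix. The study may have sampled differently.
    """
    if not 0.0 < density <= 1.0:
        raise ValueError("density must be in (0, 1]")
    rng = random.Random(seed)
    cells = [(i, j) for i, row in enumerate(matrix) for j in range(len(row))]
    keep = set(rng.sample(cells, round(density * len(cells))))
    return [
        [value if (i, j) in keep else None for j, value in enumerate(row)]
        for i, row in enumerate(matrix)
    ]


def recall(y_true, y_pred):
    """Fraction of true actives (label 1) that were predicted active."""
    predictions_for_actives = [p for t, p in zip(y_true, y_pred) if t == 1]
    if not predictions_for_actives:
        return float("nan")  # recall is undefined without any true active
    hits = sum(1 for p in predictions_for_actives if p == 1)
    return hits / len(predictions_for_actives)


def per_assay_recall(true_matrix, pred_matrix):
    """Recall for each assay (column) of two equally shaped matrices."""
    true_columns = zip(*true_matrix)
    pred_columns = zip(*pred_matrix)
    return [recall(t, p) for t, p in zip(true_columns, pred_columns)]


# Toy 4-compound x 3-assay matrix (made-up labels, not data from the study).
full = [[1, 0, 1], [0, 0, 1], [1, 1, 0], [0, 1, 0]]
sparse = subsample_to_density(full, density=0.5)
observed = sum(value is not None for row in sparse for value in row)
print(f"{observed} of 12 cells kept for training")
```

In a comparison like the one described, a model would be trained on the labels left in `sparse` and its predictions for held-out compounds passed to `per_assay_recall`. Averaging that list over assays corresponds to the grid-level comparison mentioned in the abstract.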