Conference: IEEE Smart World Congress

Mining the Critical Conditions for New Hypotheses of Materials from Historical Reaction Data

Abstract

New findings in materials science often come at a high research cost, for two reasons. First, the chemical reaction process needs continuous optimization and may consume large amounts of valuable reactants and equipment in day-to-day experiments. Second, the success of a designed experiment relies heavily on the researchers' experience. With the launch of the Materials Genome Initiative (MGI), researchers have begun to record historical reaction data and to seek new solutions through computational techniques such as data mining and machine learning. In this paper, we study reaction data on inorganic-organic hybrid materials from the Dark Reaction Project at Haverford College using simple machine learning algorithms (i.e., Bayes Net, SVM, and C4.5), ensemble learning models (i.e., Random Forest, Stacking, Gradient Boosting Decision Tree (GBDT), and XGBoost), and deep neural network models. Besides the accuracy of the prediction models, we also analyze which reaction conditions are chemically important, using different ranking algorithms. Through a series of evaluations, we find that a well-designed stacking-based ensemble learning model reaches the highest prediction accuracy, 61% (8% higher than GBDT and 5% higher than XGBoost), on the top-50 subsets based on 'symmetrical uncertainty ranking', evaluated on a standalone data set that was not previously used in the Dark Reaction Project.
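The abstract names 'symmetrical uncertainty ranking' as the basis for its top-50 feature subsets but does not show how it is computed. As a rough illustration only, the standard definition is SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), where H is Shannon entropy and I is mutual information. The sketch below assumes discrete (already binned) features; the function names and the toy feature names (`temp_high`, `noise`) are hypothetical and are not taken from the paper or from the Dark Reaction Project data.

```python
from collections import Counter
from math import log2
from typing import Dict, Hashable, List, Sequence, Tuple


def entropy(values: Sequence[Hashable]) -> float:
    """Shannon entropy, in bits, of a discrete sequence."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def symmetrical_uncertainty(x: Sequence[Hashable], y: Sequence[Hashable]) -> float:
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)); ranges from 0 to 1."""
    h_x, h_y = entropy(x), entropy(y)
    if h_x + h_y == 0.0:
        # Both variables are constant, so there is nothing to share.
        return 0.0
    h_xy = entropy(list(zip(x, y)))   # joint entropy H(X, Y)
    mutual_info = h_x + h_y - h_xy    # I(X; Y)
    return 2.0 * mutual_info / (h_x + h_y)


def rank_features(
    features: Dict[str, Sequence[Hashable]],
    labels: Sequence[Hashable],
    top_k: int,
) -> List[Tuple[str, float]]:
    """Score each feature against the labels and keep the top_k highest."""
    scored = [
        (name, symmetrical_uncertainty(column, labels))
        for name, column in features.items()
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    # Hypothetical toy data: 1 = reaction succeeded, 0 = reaction failed.
    outcome = [1, 1, 0, 0]
    toy_features = {
        "noise": [0, 1, 0, 1],      # unrelated to the outcome
        "temp_high": [1, 1, 0, 0],  # perfectly aligned with the outcome
    }
    print(rank_features(toy_features, outcome, top_k=1))
```

With a real data set, `top_k=50` would correspond to the top-50 subset size mentioned in the abstract; continuous reaction conditions would need to be discretized first, which this sketch leaves out.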
