Journal: Environmental Science & Technology

Predicting Micropollutant Removal by Reverse Osmosis and Nanofiltration Membranes: Is Machine Learning Viable?

Abstract

Predictive models for micropollutant removal by membrane separation are highly desirable for the design and selection of appropriate membranes. While machine learning (ML) models have been applied for such purposes, their reliability might be compromised by data leakage due to inappropriate data splitting. More importantly, whether ML models can truly understand the mechanisms of membrane separation has not been revealed. In this study, we evaluate the capability of the XGBoost model to predict micropollutant removal efficiencies of reverse osmosis and nanofiltration membranes. Our results demonstrate that data leakage leads to falsely high prediction accuracy. By utilizing a model interpretation method based on the cooperative game theory, we test the knowledge of XGBoost on the mechanisms of membrane separation via quantifying the contributions of input variables to the model predictions. We reveal that XGBoost possesses an adequate understanding of size exclusion, but its knowledge of electrostatic interactions and adsorption is limited. Our findings suggest that future work should focus more on avoiding data leakage and evaluating the mechanistic knowledge of ML models. In addition, high-quality data from more diverse experimental conditions, as well as more informative variables, are needed to improve the accuracy of ML models for predicting membrane performance.
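
The abstract attributes inflated accuracy to data leakage from inappropriate data splitting. The sketch below illustrates one common way such leakage can arise and be avoided. It assumes (this is not stated in the abstract) that leakage comes from records of the same compound landing in both the training and test sets. The toy records, the `compound` key, and all function names are hypothetical.

```python
import random
from collections import defaultdict


def random_split(records, test_fraction=0.2, seed=0):
    """Shuffle individual records and split them, ignoring any grouping."""
    rng = random.Random(seed)
    shuffled = records[:]
    rng.shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def group_split(records, group_key, test_fraction=0.2, seed=0):
    """Split whole groups, so all records sharing a group stay on one side.

    Note that test_fraction here is a fraction of groups, not of records.
    """
    groups = defaultdict(list)
    for rec in records:
        groups[rec[group_key]].append(rec)
    names = sorted(groups)
    rng = random.Random(seed)
    rng.shuffle(names)
    n_test = max(1, round(len(names) * test_fraction))
    test_names, train_names = names[:n_test], names[n_test:]
    train = [r for g in train_names for r in groups[g]]
    test = [r for g in test_names for r in groups[g]]
    return train, test


def shared_groups(train, test, group_key):
    """Return the groups that appear on both sides of a split."""
    return {r[group_key] for r in train} & {r[group_key] for r in test}


# Hypothetical toy data: 10 compounds, each measured 5 times.
records = [
    {"compound": f"c{i}", "replicate": j, "removal": 50 + 5 * i + j}
    for i in range(10)
    for j in range(5)
]

rand_train, rand_test = random_split(records)
grp_train, grp_test = group_split(records, "compound")

print("shared compounds, random split:",
      len(shared_groups(rand_train, rand_test, "compound")))
print("shared compounds, group split:",
      len(shared_groups(grp_train, grp_test, "compound")))
```

With the group-wise split, no compound is shared between the two sets, so test accuracy reflects performance on compounds the model has not seen. Which variable should define the groups (compound, membrane, source study) depends on the dataset and is left open here.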
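
The abstract describes quantifying each input variable's contribution to a prediction with a method based on cooperative game theory. The standard attribution concept from that field is the Shapley value. The sketch below computes exact Shapley values for a small toy function. It is an illustration of the concept only: the toy model, its three features, and the all-zero baseline are hypothetical, and whether the authors' method matches this exact computation is an assumption.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x, relative to a baseline point.

    A feature absent from a coalition is set to its baseline value.
    The cost grows as 2**n, so this suits only a handful of features.
    """
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_model(z):
    """Hypothetical 'removal' score with one interaction term."""
    size, charge, hydrophobicity = z
    return 3.0 * size + 1.0 * charge + 2.0 * size * hydrophobicity


x = [1.0, 1.0, 1.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)

# Analytically the contributions are 4, 1 and 1: the interaction term
# (worth 2) is shared equally between size and hydrophobicity.
print("contributions:", phi)
print("sum of contributions:", sum(phi))
print("prediction minus baseline:", toy_model(x) - toy_model(baseline))
```

The contributions sum to the difference between the prediction and the baseline prediction, which is the property that makes this kind of attribution useful for checking whether a model's reliance on a variable matches the expected physical mechanism.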
