Journal: QSAR & Combinatorial Science

Understanding the aquatic toxicity of pesticide: Structure-activity relationship and molecular descriptors to distinguish the ratings of toxicity

Abstract

The purpose of this work is to develop robust, interpretable structure-activity relationship (SAR) models for assessing the aquatic toxicity of pesticides. A data set of 1600 chemicals was collected, comprising 533 nontoxic (C0), 287 slightly toxic (C1), 329 moderately toxic (C2), 231 highly toxic (C3), and 220 very highly toxic (C4) compounds with respect to aquatic organisms. Their chemical structures were encoded into 196 molecular descriptors, including 2D topological and electrotopological state variables as well as the MlogP and AlogP parameters. Two variable selection techniques, the stepwise procedure and genetic algorithms (GA), were coupled with linear discriminant analysis (LDA) to obtain stable and thoroughly validated QSARs. Our results reveal that AlogP alone classifies C0 versus C4 compounds with an accuracy of 70.4% but discriminates poorly between the other groups, while MlogP shows no pronounced correlation with aquatic toxicity for any of the groups. Using all the theoretical descriptors, the GALDA models for the C(0,4), C(1,3), C(1,4), and C(2,4) classifications are acceptable, with external prediction accuracies ranging from 66.3% to 80.6%. The selected descriptors, which account for molecular size, electrotopological state, and hydrophobicity, were found to be crucial to modeling aquatic toxicity. The robustness and predictive performance of the proposed models were verified using both internal validation (leave-one-out cross-validation and Y-scrambling) and external statistical validation (with randomly selected compounds). Our results demonstrate that the genetic algorithms have a marked advantage over the stepwise procedure, generating more reliable models while using far fewer descriptors for all the data sets.
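To make the internal validation steps concrete, here is a minimal sketch of leave-one-out cross-validation and Y-scrambling around a one-descriptor, two-class discriminant. It is an illustration only and not the authors' code. The descriptor values are synthetic placeholders standing in for something like AlogP, the function names (`fit_lda_1d`, `loo_accuracy`, `y_scrambling`) are hypothetical, and the GA and stepwise descriptor selection are not shown. It assumes equal priors and a shared variance, in which case one-descriptor LDA reduces to a threshold at the midpoint between the two class means.

```python
import random
from statistics import mean


def fit_lda_1d(xs, ys):
    """Fit a one-descriptor, two-class LDA assuming equal priors and a
    shared variance: the boundary is the midpoint of the class means."""
    m0 = mean(x for x, y in zip(xs, ys) if y == 0)
    m1 = mean(x for x, y in zip(xs, ys) if y == 1)
    threshold = (m0 + m1) / 2.0
    sign = 1 if m1 >= m0 else -1  # which side of the threshold is class 1
    return threshold, sign


def predict(model, x):
    """Return the predicted class (0 or 1) for one descriptor value."""
    threshold, sign = model
    return 1 if sign * (x - threshold) > 0 else 0


def loo_accuracy(xs, ys):
    """Leave-one-out CV: refit without compound i, then predict compound i."""
    hits = 0
    for i in range(len(xs)):
        model = fit_lda_1d(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        hits += int(predict(model, xs[i]) == ys[i])
    return hits / len(xs)


def y_scrambling(xs, ys, n_rounds=50, seed=0):
    """Y-scrambling: shuffle the class labels, redo the cross-validation,
    and return the mean accuracy over all rounds as a chance baseline."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_rounds):
        shuffled = ys[:]
        rng.shuffle(shuffled)
        scores.append(loo_accuracy(xs, shuffled))
    return mean(scores)


# Synthetic descriptor values (NOT data from the paper): two overlapping
# ranges standing in for a "nontoxic" class 0 and a "very toxic" class 1.
xs = [0.1 * i for i in range(30)] + [2.0 + 0.1 * i for i in range(30)]
ys = [0] * 30 + [1] * 30

real_acc = loo_accuracy(xs, ys)
scrambled_acc = y_scrambling(xs, ys)
print(f"LOO accuracy with true labels:      {real_acc:.3f}")
print(f"Mean LOO accuracy, scrambled labels: {scrambled_acc:.3f}")
```

A model whose cross-validated accuracy is not clearly above the scrambled-label baseline is likely fitting chance correlations, which is the failure mode Y-scrambling is meant to expose.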