Toxicology: An International Journal Concerned with the Effects of Chemicals on Living Systems

Support vector machine: Classifying and predicting mutagenicity of complex mixtures based on pollution profiles

Abstract

Powerful, robust in silico approaches offer great promise for classifying and predicting the biological effects of complex mixtures and for identifying the constituents of greatest concern. Support vector machine (SVM) methods can handle high-dimensional data with small sample sizes and examine multiple interrelationships among samples. In this work, we applied SVM methods to examine the pollution profiles and mutagenicity of 60 water samples obtained from 6 cities in China during 2006-2011. Pollutant profiles were characterized in water extracts by gas chromatography-mass spectrometry (GC/MS), and mutagenicity was examined by Ames assays. We encoded feature vectors of GC/MS peaks in the mixtures and used 48 samples as the training set, reserving 12 samples as the test set. The SVM classification and regression models were constructed from whole pollution profiles, with compounds ranked by their correlation with mutagenicity. Both classification and prediction performance were evaluated. The SVM model based on whole pollution profiles showed lower performance (sensitivity 69.5-70.7%, specificity 70.6-73.2%, accuracy 69.9-72.1%, correlation coefficient 0.55-0.59) than one based on the compounds most strongly associated with mutagenicity. An SVM model with the top 10 compounds had the highest performance (sensitivity 89.8-90.3%, specificity 90.1-92.1%, accuracy 90.1-91.3%, correlation coefficient 0.80-0.82), with negligible decreases in performance between the training and test sets. SVM can be a powerful, robust classifier of the relationship between pollutants and mutagenicity in complex real-world mixtures. The top 14 compounds have the greatest contribution to mutagenicity and deserve further study to identify these constituents.
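
The abstract mentions two steps that are easy to illustrate in isolation: ranking compounds by their correlation with mutagenicity, and scoring a classifier by sensitivity, specificity, and accuracy. The sketch below is not the authors' pipeline and does not train an SVM. It assumes Pearson correlation as the ranking statistic (the abstract does not say which one was used), and the function names and toy numbers are hypothetical, chosen only for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no ranking information
    return cov / sqrt(var_x * var_y)


def top_k_features(profiles, response, k):
    """Indices of the k columns with the largest |r| against the response.

    profiles: list of samples, each a list of peak intensities (assumed layout).
    response: one mutagenicity value per sample.
    """
    n_features = len(profiles[0])
    scores = [
        abs(pearson([row[j] for row in profiles], response))
        for j in range(n_features)
    ]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)[:k]


def classification_metrics(y_true, y_pred):
    """Return (sensitivity, specificity, accuracy) for binary labels 0/1.

    Assumes both classes occur in y_true, so neither denominator is zero.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp / (tp + fn), tn / (tn + fp), (tp + tn) / len(pairs)


# Toy data (made up): 4 samples x 3 peaks, plus a mutagenicity score per sample.
profiles = [
    [1.0, 0.2, 5.0],
    [2.0, 0.1, 5.0],
    [3.0, 0.4, 5.0],
    [4.0, 0.3, 5.0],
]
response = [10.0, 20.0, 30.0, 40.0]

selected = top_k_features(profiles, response, k=2)
metrics = classification_metrics(
    [1, 1, 1, 1, 0, 0, 0, 0],  # true labels (mutagenic = 1)
    [1, 1, 1, 0, 0, 0, 0, 0],  # hypothetical classifier output
)
print(selected, metrics)
```

In a workflow like the one described, the selected column indices would be computed on the training samples only and then used to subset both the training and test profiles before fitting a classifier, so that the test set plays no part in feature selection.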