Journal: European Journal of Mass Spectrometry

Predicting the absence of an unknown compound in a mass spectral database

Abstract

Only a small subset of the known organic compounds amenable to gas chromatography/mass spectrometry is present in the largest mass spectral databases (such as NIST or Wiley). Nevertheless, the library search algorithms available on the market cannot predict the absence of a compound in the database. In the present work, we tried to implement such a prediction by means of supervised classification. The training and validation sets contained 1500 and 750 compounds, respectively. Two prediction sets (containing 750 and about 3000 mass spectra) were considered. The easiest-to-use models were built with only one input variable: the match factor of the best candidate or the InLib factor (both parameters were calculated within the MS Search (NIST) software). Multivariate classification models were built by partial least squares discriminant analysis (PLS-DA), with the match factors of the top n candidates used as input variables. PLS-DA was found to be the most effective approach. The prediction efficiency strongly depended on the 'uniqueness' of the mass spectra present in the test set. The PLS-DA model correctly predicted the absence of a compound in the database in 29.9% of cases for prediction set #1 and in 74.4% of cases for prediction set #2 (only 1.3% and 2.5% of the compounds actually present in the database were wrongly classified).
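
The single-variable model mentioned in the abstract can be illustrated roughly as follows. This is only a minimal sketch, not the authors' procedure: it assumes the classifier is a plain cutoff on the best candidate's match factor, that the cutoff is tuned on compounds known to be in the library so that only a chosen fraction of them is wrongly flagged, and that match factors lie on a 0-999 scale. The function names and all scores are hypothetical and invented for illustration.

```python
from typing import Sequence


def choose_threshold(in_library_scores: Sequence[float], max_false_absent: float) -> float:
    """Pick a cutoff such that at most `max_false_absent` of the in-library
    compounds have a best match factor strictly below it (assumed tuning rule)."""
    ranked = sorted(in_library_scores)
    # Number of in-library spectra we tolerate being flagged as absent.
    allowed = min(int(max_false_absent * len(ranked)), len(ranked) - 1)
    return ranked[allowed]


def predict_absent(best_match_factor: float, threshold: float) -> bool:
    """Flag the unknown as 'not in the database' when its best hit scores below the cutoff."""
    return best_match_factor < threshold


# Made-up validation scores (0-999 scale assumed); not data from the paper.
in_library = [930, 905, 880, 860, 845, 820, 790, 760, 720, 650]
not_in_library = [810, 770, 735, 700, 680, 640, 610, 580, 540, 500]

threshold = choose_threshold(in_library, max_false_absent=0.10)

# Fraction of truly absent compounds that the cutoff detects.
detected = sum(predict_absent(s, threshold) for s in not_in_library) / len(not_in_library)
# Fraction of in-library compounds wrongly classified as absent.
false_absent = sum(predict_absent(s, threshold) for s in in_library) / len(in_library)

print(threshold, detected, false_absent)
```

Fixing the tolerated error on in-library compounds first and then measuring how many absent compounds are caught mirrors the way the abstract reports its results (a detection rate alongside a small misclassification rate). A multivariate model such as PLS-DA would replace the single score with the match factors of the top n candidates.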