Journal: Analytical Chemistry

Unique Ion Filter: A Data Reduction Tool for GC/MS Data Preprocessing Prior to Chemometric Analysis



Abstract

Using raw GC/MS data as the X-block for chemometric modeling has the potential to provide better classification models for complex samples than using the total ion current (TIC), extracted ion chromatograms/profiles (EIC/EIP), or integrated peak tables. However, the abundance of raw GC/MS data necessitates some form of data reduction/feature selection to remove the variables containing primarily noise from the data set. Several algorithms for feature selection exist; however, because of the extreme number of variables (10^6 to 10^8 variables per chromatogram), feature selection can be slow and computationally expensive. Herein, we present a new prefilter for automated data reduction of GC/MS data prior to feature selection. This tool, termed unique ion filter (UIF), is a module that can be added after chromatographic alignment and prior to any subsequent feature selection algorithm. The UIF objectively reduces the number of irrelevant or redundant variables in raw GC/MS data while preserving potentially relevant analytical information. In the m/z dimension, data are reduced from a full spectrum to a handful of unique ions for each chromatographic peak. In the time dimension, data are reduced to only a handful of scans around each peak apex. UIF was applied to a set of GC/MS data for a variety of gasoline samples to be classified according to octane rating using partial least-squares discriminant analysis (PLS-DA). It was also applied to a series of chromatograms from casework fire debris analysis to be classified on the basis of whether or not signatures of gasoline were detected. By reducing the overall population of candidate variables subjected to subsequent variable selection, the UIF reduced the total feature selection time needed to achieve a perfect classification of all validation data from 373 to 9 min (a 98% reduction in computing time). Additionally, the significant reduction in included variables resulted in a concomitant reduction in noise, improving overall model quality. A minimum of two unique m/z values and a scan window of three scans about the peak apex provided enough information about each peak for successful PLS-DA modeling of the data, as 100% model prediction accuracy was achieved. It is also shown that the application of UIF does not alter the underlying chemical information in the data.
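The two-dimensional reduction described in the abstract (a few ions per peak in the m/z dimension, a few scans about each apex in the time dimension) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not state how "unique" ions are chosen or how peak apexes are located, so this sketch assumes the apex scan indices are already known and, as a stand-in for the uniqueness criterion, simply keeps the most intense m/z channels at each apex. The function name `uif_mask` and its parameters `n_ions` and `scan_window` are hypothetical.

```python
import numpy as np

def uif_mask(X, apexes, n_ions=2, scan_window=3):
    """Return a boolean mask over a (scans x m/z) matrix that keeps only
    a few m/z channels in a short scan window about each peak apex.

    Assumption: apex scan indices are supplied by an earlier peak-finding step.
    Assumption: "unique ions" are approximated by the n_ions most intense
    m/z channels at the apex scan (a stand-in, not the published criterion).
    """
    X = np.asarray(X, dtype=float)
    n_scans, n_mz = X.shape
    mask = np.zeros((n_scans, n_mz), dtype=bool)
    half = scan_window // 2
    for apex in apexes:
        # Indices of the n_ions largest intensities in the apex spectrum.
        ions = np.argsort(X[apex])[::-1][:n_ions]
        # Scan window centered on the apex, clipped to the chromatogram.
        lo = max(0, apex - half)
        hi = min(n_scans, apex + half + 1)
        mask[lo:hi, ions] = True
    return mask

# Toy chromatogram: 10 scans x 6 m/z channels with a flat baseline
# and two peaks whose apexes are at scans 3 and 7.
X = np.ones((10, 6))
X[3, [1, 4]] = [100.0, 80.0]
X[7, [0, 2]] = [90.0, 70.0]

mask = uif_mask(X, apexes=[3, 7], n_ions=2, scan_window=3)
kept = int(mask.sum())
print(f"{kept} of {mask.size} variables kept")  # 12 of 60 variables kept
```

With two ions and a three-scan window, each peak contributes six variables, so the candidate set passed to any later feature selection step shrinks from 60 to 12 in this toy case. The defaults mirror the minimum settings quoted in the abstract (two m/z values, a window of three scans).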
