Journal: Metabolomics

Gaussian binning: a new kernel-based method for processing NMR spectroscopic data for metabolomics

Abstract

In many metabolomics studies, NMR spectra are divided into bins of fixed width. This spectral quantification technique, known as uniform binning, is used to reduce the number of variables for pattern recognition techniques and to mitigate the effects of variations in peak positions. However, because the bin boundaries do not overlap, a shift in a peak near a boundary can cause dramatic quantitative changes in the adjacent bins. Here we describe a new Gaussian binning method that incorporates overlapping bins to minimize these effects. A Gaussian kernel weights the signal contribution according to its distance from the bin center, and the overlap between bins is controlled by the kernel standard deviation. Sensitivity to peak shift was assessed for a series of test spectra in which the offset frequency was incremented in 0.5 Hz steps. For a 4 Hz shift within a bin width of 24 Hz, the error for uniform binning increased by 150%, while the error for Gaussian binning increased by 50%. Further, using a urinary metabolomics data set (from a toxicity study) and principal component analysis (PCA), we showed that the information content in the quantified features was equivalent for the Gaussian and uniform binning methods. The separation between groups in the PCA scores plot, measured by the J2 quality metric, is as good as or better for Gaussian binning than for uniform binning. The Gaussian method is shown to be robust with regard to peak shift, while still retaining the information needed by classification and multivariate statistical techniques for NMR-metabolomics data.
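The contrast between the two binning schemes can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no formula, so the unnormalized kernel, the choice of sigma as half the bin width, the Lorentzian test peak, and the `relative_change` measure are all assumptions made for illustration. In particular, `relative_change` is not the error measure reported in the abstract, so its values are not comparable to the 150% and 50% figures.

```python
import math

def uniform_binning(freqs, intensities, centers, width):
    """Sum intensities inside each non-overlapping bin [c - w/2, c + w/2)."""
    half = width / 2.0
    return [
        sum(y for x, y in zip(freqs, intensities) if c - half <= x < c + half)
        for c in centers
    ]

def gaussian_binning(freqs, intensities, centers, sigma):
    """Weight every point by a Gaussian of its distance from each bin center.

    The kernel is left unnormalized here (an assumption); a larger sigma
    gives more overlap between neighboring bins.
    """
    return [
        sum(y * math.exp(-((x - c) ** 2) / (2.0 * sigma ** 2))
            for x, y in zip(freqs, intensities))
        for c in centers
    ]

def lorentzian(freqs, position, half_width=1.0):
    """Hypothetical test peak: unit-height Lorentzian line at `position` (Hz)."""
    return [half_width ** 2 / ((x - position) ** 2 + half_width ** 2)
            for x in freqs]

def relative_change(before, after):
    """Total absolute change across bins, relative to the total before the shift."""
    return sum(abs(a - b) for a, b in zip(after, before)) / sum(before)

# Frequency axis from 0 to 96 Hz in 0.5 Hz steps, four bins of width 24 Hz.
freqs = [0.5 * i for i in range(193)]
width = 24.0
centers = [12.0, 36.0, 60.0, 84.0]
sigma = width / 2.0  # assumed value; the abstract does not state one

# A peak 2 Hz below the bin boundary at 24 Hz, then shifted 4 Hz across it.
peak_before = lorentzian(freqs, 22.0)
peak_after = lorentzian(freqs, 26.0)

uniform_change = relative_change(
    uniform_binning(freqs, peak_before, centers, width),
    uniform_binning(freqs, peak_after, centers, width),
)
gaussian_change = relative_change(
    gaussian_binning(freqs, peak_before, centers, sigma),
    gaussian_binning(freqs, peak_after, centers, sigma),
)
print(f"uniform: {uniform_change:.3f}, gaussian: {gaussian_change:.3f}")
```

In this toy setup the shifted peak moves most of its area from one uniform bin to the next, whereas the overlapping Gaussian weights change only gradually, so the Gaussian-binned values move less. This mirrors the qualitative behavior the abstract describes.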
