Source: Proceedings of the National Academy of Sciences of the United States of America (NIH literature collection)

Efficient denoising algorithms for large experimental datasets and their applications in Fourier transform ion cyclotron resonance mass spectrometry

Abstract

Modern scientific research produces datasets of increasing size and complexity that require dedicated numerical methods to be processed. In many cases, the analysis of spectroscopic data involves the denoising of raw data before any further processing. Current efficient denoising algorithms require the singular value decomposition of a matrix with a size that scales up as the square of the data length, preventing their use on very large datasets. Taking advantage of recent progress on random projection and probabilistic algorithms, we developed a simple and efficient method for the denoising of very large datasets. Based on the QR decomposition of a matrix randomly sampled from the data, this approach allows a gain of nearly three orders of magnitude in processing time compared with classical singular value decomposition denoising. This procedure, called urQRd (uncoiled random QR denoising), strongly reduces the computer memory footprint and allows the denoising algorithm to be applied to virtually unlimited data size. The efficiency of these numerical tools is demonstrated on experimental data from high-resolution broadband Fourier transform ion cyclotron resonance mass spectrometry, which has applications in proteomics and metabolomics. We show that robust denoising is achieved in 2D spectra whose interpretation is severely impaired by scintillation noise. These denoising procedures can be adapted to many other data analysis domains where the size and/or the processing time are crucial.
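The abstract states only that the method rests on the QR decomposition of a matrix randomly sampled from the data. The following is a minimal sketch of that idea, not the authors' implementation. It assumes the 1D signal is arranged as a Hankel matrix, that the sampling is a Gaussian random projection, and that the denoised signal is recovered by averaging antidiagonals; none of these details are given in the abstract. The function name `random_qr_denoise` and the parameters `rank` and `seed` are hypothetical. The sketch forms the full matrix explicitly, so it does not reproduce the memory savings attributed to urQRd.

```python
import numpy as np

def random_qr_denoise(signal, rank, seed=0):
    """Low-rank denoising via a random projection followed by QR.

    Assumed layout (not stated in the abstract): the signal is embedded
    in a Hankel matrix H with H[i, j] = signal[i + j].
    """
    x = np.asarray(signal, dtype=complex)
    length = x.size
    n_rows = length // 2                 # assumed split between rows and columns
    n_cols = length - n_rows + 1
    if not 0 < rank <= n_rows:
        raise ValueError("rank must be between 1 and len(signal) // 2")

    # Index grid: entry (i, j) points at sample i + j of the signal.
    idx = np.arange(n_rows)[:, None] + np.arange(n_cols)[None, :]
    hankel = x[idx]

    # Randomly sample the column space: Y = H @ Omega has only `rank` columns.
    rng = np.random.default_rng(seed)
    omega = rng.standard_normal((n_cols, rank))
    sampled = hankel @ omega

    # QR of the small sampled matrix gives an orthonormal basis Q.
    q, _ = np.linalg.qr(sampled)

    # Project H onto span(Q): a rank-limited approximation of H.
    low_rank = q @ (q.conj().T @ hankel)

    # Average each antidiagonal to map the matrix back to a 1D signal.
    sums = np.zeros(length, dtype=complex)
    counts = np.zeros(length)
    np.add.at(sums, idx, low_rank)
    np.add.at(counts, idx, 1)
    return sums / counts
```

Calling `random_qr_denoise(noisy, rank=20)` on a noisy sum of a few complex sinusoids returns an array of the same length. In this sketch `rank` is best chosen several times larger than the expected number of signal components, because a single random projection captures the signal subspace only approximately.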
