Analysis methods for large batch and process data sets: Theory and applications.

Abstract

There are many areas of chemistry in which large data sets are produced in distinct groupings (i.e., "batch" data) or in which large amounts of data need to be continuously evaluated (i.e., "process" data). Two novel techniques that can enhance both batch and process data analysis have been produced: a fractal-based analysis for outlier detection and a wavelet-based data compression that doubles as an accelerator for subsequent multiway data analyses.

Nonlinear spectroscopic data pose challenges to typical spectral analyses, not the least of which is the automatic detection of spectra that deviate greatly from a predetermined norm. Such "outliers" should be easy to discriminate against during preprocessing using straightforward multivariate data processing tools, but data nonlinearity renders common outlier diagnostics (based on Mahalanobis distance or score distance tests) inappropriate. To compensate, an outlier detection technique based on the fractal dimension of data sets' score projections has been suggested and effectively employed. Outlying scores in the score space, while not necessarily deviating from overall score clusters in a conventional sense, will cause detectable fluctuations in the cluster's fractal dimension, thus providing a reliable identification trait.

The acquisition, processing, and archiving of large multidimensional data sets generally require undesirable amounts of data storage space and analysis time. To address this, data compression by means of wavelet transforms has been proposed. Because wavelet transforms are linear (preserving underlying linear factors), chemometric results obtained in the wavelet domain from a compressed cube can be inversely transformed to derive approximated models in the original measurement domain. This technique is effective in increasing data storage capacity and accelerating multiway analysis.
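The linearity argument for wavelet compression can be illustrated with a small sketch. This is not the thesis's implementation. It assumes a single-level orthonormal Haar transform applied to one-dimensional profiles rather than a multiway cube, and the profiles `s1`, `s2` and the mixing weights `c1`, `c2` are made-up values. It shows that compressing a linear mixture gives the same coefficients as mixing the compressed components, and that the compressed coefficients can be mapped back to an approximation in the original domain.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_step(x):
    """One level of the orthonormal Haar transform (len(x) must be even)."""
    approx = [(x[i] + x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    return approx, detail

def haar_inverse_step(approx, detail):
    """Invert haar_step, interleaving the reconstructed sample pairs."""
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / SQRT2, (a - d) / SQRT2])
    return out

def compress(x):
    """Keep only the approximation coefficients (half the storage)."""
    approx, _ = haar_step(x)
    return approx

def reconstruct(approx):
    """Map compressed coefficients back, treating discarded details as zero."""
    return haar_inverse_step(approx, [0.0] * len(approx))

# Two hypothetical pure-component profiles (illustrative numbers only).
s1 = [0.0, 1.0, 3.0, 4.0, 3.0, 1.0, 0.0, 0.0]
s2 = [0.0, 0.0, 1.0, 2.0, 4.0, 5.0, 3.0, 1.0]

# A measurement that is linear in the two components.
c1, c2 = 0.7, 0.3
mixture = [c1 * a + c2 * b for a, b in zip(s1, s2)]

# Linearity: compressing the mixture matches mixing the compressed components,
# so the same factor weights describe the data in the wavelet domain.
w_mix = compress(mixture)
w_lin = [c1 * a + c2 * b for a, b in zip(compress(s1), compress(s2))]
max_gap = max(abs(a - b) for a, b in zip(w_mix, w_lin))

# Inverse transform of the compressed result: an approximation of the
# mixture in the original measurement domain (pairwise averages for Haar).
approx_mixture = reconstruct(w_mix)

print("largest linearity gap:", max_gap)
print("approximated mixture:", [round(v, 3) for v in approx_mixture])
```

With a single Haar level and all detail coefficients discarded, the reconstruction is simply the pairwise average of neighboring points. A practical compression would likely use several levels, a smoother wavelet, and a threshold on which coefficients to keep.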

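Bibliographic record

  • Author: Cramer, Jeffrey Alan
  • Author affiliation: Arizona State University
  • Degree-granting institution: Arizona State University
  • Subject: Chemistry, Analytical
  • Degree: Ph.D.
  • Year: 2006
  • Pages: 134 p.
  • Total pages: 134
  • Original format: PDF
  • Language of text: English
  • Chinese Library Classification: Chemistry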
