Journal: Bioinformatics

Genetic algorithm optimization for pre-processing and variable selection of spectroscopic data

Abstract

Motivation: The major difficulties relating to mathematical modelling of spectroscopic data are inconsistencies in spectral reproducibility and the black box nature of the modelling techniques. For the analysis of biological samples the first problem is due to biological, experimental and machine variability, which can lead to sample size differences and unavoidable baseline shifts. Consequently, there is often a requirement for mathematical correction(s) to be made to the raw data if the best possible model is to be formed. The second problem prevents interpretation of the results since the variables that most contribute to the analysis are not easily revealed; as a result, the opportunity to obtain new knowledge from such data is lost.

Methods: We used genetic algorithms (GAs) to select spectral pre-processing steps for Fourier transform infrared (FT-IR) spectroscopic data. We demonstrate a novel approach for the selection of important discriminatory variables by GA from FT-IR spectra for multi-class identification by discriminant function analysis (DFA).

Results: The GA selects sensible pre-processing steps from a total of ~10^10 possible mathematical transformations. Application of these algorithms results in a 16% reduction in the model error when compared against the raw data model. GA-DFA recovers six variables from the full set of 882 spectral variables against which a satisfactory DFA model can be formed; thus inferences can be made as to the biochemical differences that are reflected by these spectral bands.
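To make the variable-selection idea concrete, the following is a minimal sketch of a GA that searches for a small subset of discriminatory variables. It is not the authors' implementation: the data are synthetic, a nearest-centroid classifier stands in for DFA as the fitness function, and every name and parameter (`make_data`, `fitness`, `run_ga`, population size, mutation rate) is a hypothetical choice made for illustration. The last line of the block only shows why exhaustive search over six-variable subsets of 882 variables is impractical.

```python
import math
import random


def make_data(rng, n_per_class=30, n_vars=30, informative=(3, 11, 19)):
    """Synthetic 'spectra': three classes that differ only at the informative variables."""
    X, y = [], []
    for label in range(3):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_vars)]
            for j, idx in enumerate(informative):
                # Each informative variable is shifted by a class-dependent amount.
                row[idx] += 3.0 * ((label + j) % 3)
            X.append(row)
            y.append(label)
    return X, y


def fitness(subset, train, val):
    """Validation accuracy of a nearest-centroid classifier (a stand-in for DFA)."""
    X_train, y_train = train
    X_val, y_val = val
    classes = sorted(set(y_train))
    centroids = {}
    for c in classes:
        rows = [x for x, lab in zip(X_train, y_train) if lab == c]
        centroids[c] = [sum(r[i] for r in rows) / len(rows) for i in subset]
    correct = 0
    for x, lab in zip(X_val, y_val):
        pred = min(
            classes,
            key=lambda c: sum((x[i] - m) ** 2 for i, m in zip(subset, centroids[c])),
        )
        correct += pred == lab
    return correct / len(y_val)


def run_ga(train, val, n_vars, k=3, pop_size=30, generations=25,
           mutation_rate=0.3, seed=0):
    """Evolve sorted tuples of k distinct variable indices; return (best, accuracy)."""
    rng = random.Random(seed)
    cache = {}

    def score(subset):
        if subset not in cache:
            cache[subset] = fitness(subset, train, val)
        return cache[subset]

    pop = [tuple(sorted(rng.sample(range(n_vars), k))) for _ in range(pop_size)]
    for _ in range(generations):
        next_pop = sorted(pop, key=score, reverse=True)[:2]  # elitism: keep top two
        while len(next_pop) < pop_size:
            # Tournament selection of two parents.
            p1 = max(rng.sample(pop, 3), key=score)
            p2 = max(rng.sample(pop, 3), key=score)
            # Crossover: draw k indices from the union of both parents.
            child = rng.sample(sorted(set(p1) | set(p2)), k)
            # Mutation: replace one index with a random variable.
            if rng.random() < mutation_rate:
                child[rng.randrange(k)] = rng.randrange(n_vars)
            if len(set(child)) == k:  # discard children with duplicate indices
                next_pop.append(tuple(sorted(child)))
        pop = next_pop
    best = max(pop, key=score)
    return best, score(best)


if __name__ == "__main__":
    train = make_data(random.Random(1))
    val = make_data(random.Random(2))
    print(run_ga(train, val, n_vars=30))
    # Number of distinct six-variable subsets of 882 variables.
    print(math.comb(882, 6))
```

Because the fitness here is accuracy on a single held-out set, the GA can overfit that set; a cross-validated error would be a more cautious choice on real spectra.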