Journal: Chemometrics and Intelligent Laboratory Systems

Efficient feature selection for mass spectrometry based electronic nose applications


Abstract

High dimensionality is inherent to MS-based electronic nose applications, where hundreds of variables per measurement (m/z fragments) are available, a significant number of them highly correlated or noisy. Feature selection is therefore an unavoidable pre-processing step if robust and parsimonious pattern classification models are to be developed. This article introduces a new feature selection strategy and demonstrates its performance on two MS e-nose databases. The selection is conducted in three steps. The first two steps remove noisy or non-informative features and highly collinear (i.e., redundant) features, respectively. These two steps are computationally inexpensive and dramatically reduce the number of variables: nearly 80% of the initially available features are eliminated after the second step. The third step uses a stochastic variable selection method (simulated annealing) to further reduce the number of variables. For example, applying the method to an Iberian ham database reduced the number of features from 209 to 14. Using the surviving m/z fragments, a fuzzy ARTMAP classifier sorted ham samples according to producer and quality (an 11-category classification) with a 97.24% success rate. The whole feature selection process runs in a few minutes on a Pentium IV PC.
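The three-step pipeline described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not give the criteria used in the first two steps, so a between-class to within-class variance ratio and a pairwise correlation threshold are assumed here; a leave-one-out 1-nearest-neighbour score stands in for the fuzzy ARTMAP classifier; and every function name, threshold, and the synthetic data are hypothetical.

```python
import numpy as np

def drop_uninformative(X, y, min_ratio=1.0):
    """Step 1 (assumed criterion): keep features whose between-class to
    within-class sum-of-squares ratio exceeds min_ratio."""
    classes = np.unique(y)
    grand = X.mean(axis=0)
    between = sum((y == c).sum() * (X[y == c].mean(axis=0) - grand) ** 2
                  for c in classes)
    within = sum(((X[y == c] - X[y == c].mean(axis=0)) ** 2).sum(axis=0)
                 for c in classes)
    return np.flatnonzero(between / (within + 1e-12) > min_ratio)

def drop_collinear(X, keep, max_corr=0.95):
    """Step 2 (assumed criterion): greedily keep a feature only if its
    absolute correlation with every already-kept feature is below max_corr."""
    corr = np.abs(np.corrcoef(X[:, keep], rowvar=False))
    selected = []
    for j in range(len(keep)):
        if all(corr[j, k] < max_corr for k in selected):
            selected.append(j)
    return keep[selected]

def loo_1nn_accuracy(X, y):
    """Leave-one-out 1-NN accuracy (a stand-in for fuzzy ARTMAP)."""
    d = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d, np.inf)  # a sample may not be its own neighbour
    return float((y[d.argmin(axis=1)] == y).mean())

def anneal(X, y, candidates, n_iter=300, t0=0.1, cooling=0.98,
           penalty=0.01, seed=0):
    """Step 3: simulated annealing over feature subsets. The cost is the
    error rate plus a small per-feature penalty (hypothetical weighting)."""
    rng = np.random.default_rng(seed)

    def cost(mask):
        if not mask.any():
            return np.inf  # an empty subset is never acceptable
        acc = loo_1nn_accuracy(X[:, candidates[mask]], y)
        return 1.0 - acc + penalty * mask.sum()

    mask = np.ones(len(candidates), dtype=bool)
    cur = cost(mask)
    best_mask, best = mask.copy(), cur
    t = t0
    for _ in range(n_iter):
        trial = mask.copy()
        trial[rng.integers(len(candidates))] ^= True  # flip one feature
        c = cost(trial)
        # Always accept improvements; accept worse moves with a
        # probability that shrinks as the temperature t falls.
        if c <= cur or rng.random() < np.exp((cur - c) / t):
            mask, cur = trial, c
            if cur < best:
                best_mask, best = mask.copy(), cur
        t *= cooling
    return candidates[best_mask]

def demo(seed=0):
    """Synthetic data: 2 informative, 2 near-duplicate, 16 noise features."""
    rng = np.random.default_rng(seed)
    y = np.repeat(np.arange(3), 20)
    informative = (np.stack([y == 1, y == 2], axis=1) * 5.0
                   + rng.normal(size=(60, 2)))
    redundant = informative + 0.01 * rng.normal(size=(60, 2))
    noise = rng.normal(size=(60, 16))
    X = np.hstack([informative, redundant, noise])
    step1 = drop_uninformative(X, y)
    step2 = drop_collinear(X, step1)
    step3 = anneal(X, y, step2)
    return X.shape[1], step1, step2, step3

if __name__ == "__main__":
    n_total, s1, s2, s3 = demo()
    print(n_total, len(s1), len(s2), len(s3))
```

The ordering mirrors the abstract's rationale: the two cheap filters shrink the candidate set first, so the comparatively expensive stochastic search only has to explore subsets of the surviving features.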
