Journal of Analytical Methods in Chemistry

Tracing Geographical Origins of Teas Based on FT-NIR Spectroscopy: Introduction of Model Updating and Imbalanced Data Handling Approaches

Abstract

This work presents a reliable approach for tracing the geographical origins of teas despite the changes in teas caused by different harvest years. A total of 1447 tea samples collected from various areas in 2014 (660 samples) and 2015 (787 samples) were analyzed by FT-NIR spectroscopy. Seven classifiers trained on the 2014 dataset all succeeded in tracing the origins of the samples collected in 2014; however, they all failed to predict the origins of the 2015 samples because of the different data distributions and the imbalanced dataset. Three undersampling approaches based on outlier detection (one-class SVM (OC-SVM), isolation forest, and elliptic envelope) were then proposed; as a result, the highest macro average recall (MAR) for the 2015 dataset improved from 56.86% to 73.95% (with the SVM classifier). A model updating approach was also applied, and the prediction MAR improved significantly as the updating rate increased. The best MAR (90.31%) was first achieved at a 50% updating rate by the SVM classifier combined with OC-SVM undersampling.
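The abstract reports results as macro average recall (MAR), which is the unweighted mean of the per-class recalls, so that a small origin class counts as much as a large one. The following is a minimal sketch of that metric in plain Python. The function name `macro_average_recall` and the toy labels are hypothetical illustrations and are not taken from the paper.

```python
from collections import defaultdict


def macro_average_recall(y_true, y_pred):
    """Return the unweighted mean of per-class recall.

    Recall for a class = correctly predicted samples of that class
    divided by all true samples of that class.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one sample is required")

    total = defaultdict(int)    # true samples per class
    correct = defaultdict(int)  # correctly predicted samples per class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1

    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Hypothetical imbalanced example: 8 samples of origin "A", 2 of origin "B".
# A classifier that always predicts "A" looks good on accuracy,
# but MAR exposes that origin "B" is never recovered.
y_true = ["A"] * 8 + ["B"] * 2
y_pred = ["A"] * 10

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
mar = macro_average_recall(y_true, y_pred)
print(f"accuracy = {accuracy:.2f}, MAR = {mar:.2f}")
```

In this toy case the accuracy is 0.80 while the MAR is 0.50, which illustrates why a class-averaged metric is a natural choice when the dataset is imbalanced.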
