Source: Scientific Reports (NIH literature collection)

Machine learning approaches for large scale classification of produce


Abstract

The analysis and identification of different attributes of produce such as taxonomy, vendor, and organic nature is vital to verifying product authenticity in a distribution network. Though a variety of analysis techniques have been studied in the past, we present a novel data-centric approach to classifying produce attributes. We employed visible and near infrared (NIR) spectroscopy on over 75,000 samples across several fruit and vegetable varieties. This yielded 0.90–0.98 and 0.98–0.99 classification accuracies for taxonomy and farmer classes, respectively. The most significant factors in the visible spectrum were variations in the produce color due to chlorophyll and anthocyanins. In the infrared spectrum, we observed that the varying water and sugar content levels were critical to obtaining high classification accuracies. High quality spectral data along with an optimal tuning of hyperparameters in the support vector machine (SVM) was also key to achieving high classification accuracies. In addition to demonstrating exceptional accuracies on test data, we explored insights behind the classifications, and identified the highest performing approaches using cross validation. We presented data collection guidelines, experimental design parameters, and machine learning optimization parameters for the replication of studies involving large sample sizes.
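The abstract credits hyperparameter tuning of the SVM, with approaches compared by cross validation, for much of the accuracy. The following is a minimal sketch of that idea under loud assumptions, not the authors' pipeline: the "spectra" are synthetic two-class data, the classifier is a simple linear SVM trained by stochastic subgradient descent on the hinge loss (the abstract does not state the kernel or solver), and all names (`make_spectra`, `train_linear_svm`, `select_lambda`, the candidate values for `lam`) are hypothetical.

```python
import random

def make_spectra(n_per_class, n_bands=8, seed=0):
    """Synthetic two-class 'spectra' (an assumption, not real NIR data).
    The first half of the bands is shifted by +/-1 depending on the class."""
    rng = random.Random(seed)
    data = []
    for label in (-1, 1):
        for _ in range(n_per_class):
            x = [rng.gauss(1.0 * label if b < n_bands // 2 else 0.0, 1.0)
                 for b in range(n_bands)]
            data.append((x, label))
    rng.shuffle(data)
    return data

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def train_linear_svm(data, lam, lr=0.01, epochs=20, seed=0):
    """Stochastic subgradient descent on hinge loss + (lam / 2) * ||w||^2."""
    rng = random.Random(seed)
    w = [0.0] * len(data[0][0])
    b = 0.0
    for _ in range(epochs):
        order = list(range(len(data)))
        rng.shuffle(order)
        for i in order:
            x, y = data[i]
            violated = y * (dot(w, x) + b) < 1  # sample inside the margin
            w = [wj - lr * lam * wj for wj in w]  # L2 shrinkage step
            if violated:
                w = [wj + lr * y * xj for wj, xj in zip(w, x)]
                b += lr * y
    return w, b

def accuracy(model, data):
    w, b = model
    return sum(1 for x, y in data if y * (dot(w, x) + b) > 0) / len(data)

def cross_val_accuracy(data, lam, k=5):
    """Mean held-out accuracy over k folds for one hyperparameter value."""
    folds = [data[i::k] for i in range(k)]
    scores = []
    for i in range(k):
        train = [s for j, fold in enumerate(folds) if j != i for s in fold]
        scores.append(accuracy(train_linear_svm(train, lam), folds[i]))
    return sum(scores) / k

def select_lambda(data, candidates, k=5):
    """Pick the regularization strength with the best cross-validated accuracy."""
    results = {lam: cross_val_accuracy(data, lam, k) for lam in candidates}
    return max(results, key=results.get), results

data = make_spectra(n_per_class=100)
train, test = data[:160], data[160:]  # the test split is never used for tuning
best_lam, cv_scores = select_lambda(train, [1e-3, 1e-2, 1e-1, 1.0])
final_model = train_linear_svm(train, best_lam)
test_accuracy = accuracy(final_model, test)
print(best_lam, round(test_accuracy, 3))
```

The point of the design is the separation of roles: cross validation on the training split chooses the hyperparameter, and the held-out test split is touched only once, for the final accuracy estimate. A multi-class problem such as taxonomy or farmer classification would need a one-vs-rest or one-vs-one extension, which this sketch omits.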
