Open Journal of Statistics

Dimensionality Reduction of High-Dimensional Highly Correlated Multivariate Grapevine Dataset


Abstract

Viticulturists have traditionally had a keen interest in the relationship between the biochemistry of grapevine leaves and petioles and their associated spectral reflectance, in order to understand fruit ripening rate, water status, nutrient levels, and disease risk. In this paper, we use imaging spectroscopy (hyperspectral) reflectance data for the reflective 330-2510 nm wavelength region (986 spectral bands in total) to assess vineyard nutrient status; this constitutes a high-dimensional dataset with an ill-conditioned covariance matrix. Identifying the variables (wavelength bands) that contribute useful information for nutrient assessment and prediction plays a pivotal role in multivariate statistical modeling. In recent years, researchers have developed many continuous, nearly unbiased, sparse, and accurate variable selection methods to address this problem. This paper compares four regularized regression methods and one functional regression method for wavelength variable selection: Elastic Net, Multi-Step Adaptive Elastic Net, Minimax Concave Penalty, iterative Sure Independence Screening, and Functional Data Analysis. The predictive performance of the resulting regularized sparse models is then enhanced using stepwise regression. This comparative study of regression methods on a high-dimensional, highly correlated grapevine hyperspectral dataset revealed that variable selection with Elastic Net yields the best predictive ability.
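To make the idea of Elastic Net variable selection on correlated bands concrete, here is a minimal sketch in pure Python. It is not the authors' implementation and uses no grapevine data: the three "bands", the response, the glmnet-style objective (1/2n)·||y − Xβ||² + λ[α·||β||₁ + (1 − α)/2·||β||²], and the names `elastic_net_cd`, `lam`, and `alpha` are all assumptions chosen for illustration.

```python
def soft_threshold(z, t):
    """Shrink z toward zero by t; return exactly 0.0 when |z| <= t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def elastic_net_cd(X, y, lam, alpha, max_sweeps=5000, tol=1e-10):
    """Coordinate descent for the (assumed) glmnet-style elastic net objective."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X @ beta, kept up to date incrementally
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(max_sweeps):
        max_change = 0.0
        for j in range(p):
            # Covariance of column j with the residual that excludes column j.
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam * alpha) / (col_sq[j] + lam * (1 - alpha))
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * X[i][j]
                beta[j] = new
                max_change = max(max_change, abs(delta))
        if max_change < tol:
            break
    return beta


# Toy data (hypothetical): two highly correlated "bands" and one irrelevant band.
band_a = [1, 1, 1, 1, -1, -1, -1, -1]
wiggle = [1, 1, -1, -1, 1, 1, -1, -1]
band_b = [a + 0.1 * w for a, w in zip(band_a, wiggle)]  # nearly a copy of band_a
band_c = [1, -1, 1, -1, 1, -1, 1, -1]                   # orthogonal to the response
X = [list(row) for row in zip(band_a, band_b, band_c)]
y = [2 * a + 2 * b for a, b in zip(band_a, band_b)]

coef = elastic_net_cd(X, y, lam=0.1, alpha=0.5)
print(coef)  # both correlated bands keep non-zero weights; band_c is exactly 0.0
```

In this toy setting the ridge part of the penalty lets the two correlated bands share the weight, while the L1 part sets the irrelevant band to exactly zero. This is the behaviour that makes Elastic Net a natural candidate for highly correlated spectral data.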
