
Hyperspectral Image Classification with Limited Training Data Samples using Feature Subspaces


Abstract

The classification of pixels in hyperspectral imagery is often made more challenging by the availability of only small numbers of samples within training sets. Indeed, it is often the case that the number of training samples per class is smaller, sometimes considerably smaller, than the dimensionality of the problem. Various techniques may be used to mitigate this problem, with regularized discriminant analysis being one method, and schemes which select subspaces of the original problem being another. This paper is concerned with the latter class of approaches, which effectively make the dimensionality of the problem sufficiently small that conventional statistical pattern recognition techniques may be applied. The paper compares classification results produced using three schemes that can tolerate very small training sets. The first is a conventional feature subset selection method using information from scatter matrices to choose suitable features. The second approach uses the random subspace method (RSM), an ensemble classification technique. This method builds many 'basis' classifiers, each using a different randomly selected subspace of the original problem. The classifications produced by the basis classifiers are merged through voting to generate the final output. The final method also builds an ensemble of classifiers, but uses a smaller number to span the feature space in a deterministic way. Again, voting is used to merge the individual classifier outputs. In this paper the three feature selection methods are used in conjunction with a variant of the piecewise quadratic classifier. This classifier type is known to produce good results for hyperspectral pixel classification when the training sample sizes are large. The data examined in the paper is the well-known AVIRIS Indian Pines image, a largely agricultural scene containing some difficult-to-separate classes. Removal of absorption bands has reduced the dimensionality of the data to 200.
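The scatter-matrix feature selection described above can be illustrated with a minimal sketch. This is a simple univariate Fisher-ratio scoring, not the paper's exact criterion (which the abstract does not fully specify); the function name `fisher_ratio_selection` and the parameters are illustrative assumptions.

```python
import numpy as np

def fisher_ratio_selection(X, y, k=10):
    """Rank features by a per-feature between-class to within-class scatter
    ratio and keep the top k. A univariate stand-in for scatter-matrix-based
    feature selection; the paper's actual criterion may differ."""
    classes = np.unique(y)
    overall_mean = X.mean(axis=0)
    sb = np.zeros(X.shape[1])  # between-class scatter, per feature
    sw = np.zeros(X.shape[1])  # within-class scatter, per feature
    for c in classes:
        Xc = X[y == c]
        sb += Xc.shape[0] * (Xc.mean(axis=0) - overall_mean) ** 2
        sw += ((Xc - Xc.mean(axis=0)) ** 2).sum(axis=0)
    ratio = sb / (sw + 1e-12)  # guard against zero within-class scatter
    return np.argsort(ratio)[::-1][:k]
```

As the abstract notes, when per-class sample counts are very small the scatter estimates feeding a criterion like this become unreliable, which is exactly why the ensemble alternatives below can win in that regime.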
A two-class classification problem is examined in detail to determine the characteristic performance of the classifiers. In addition, more realistic 7-, 13- and 17-class problems are also studied. Results are calculated for a range of training set sizes and a range of feature subset sizes for each classifier type. Where the training set sizes are large, results produced using the selected feature set and a single classifier outperform the ensemble approaches, and tend to continue to improve as the number of features is increased. For the critical per-class sample size, of the order of the dimensionality of the problem, results produced using the selected feature set outperform the random subspace method for all but the largest subspace sizes attempted. For the smaller training samples the best performance is returned by the random subspace method, with the alternative ensemble approach producing competitive results for a smaller range of subspace sizes. The limited performance of the standard feature selection approach for very small samples is a consequence of the poor estimation of the scatter matrices. This, in turn, causes the best features to be omitted from the selection. The ensemble approaches used here do not rely on these estimates, and the high degree of correlation between neighbouring features in hyperspectral data allows a large number of 'reasonable' classifiers to be produced. The combination of these classifiers is capable of producing a robust output even in very small sample cases.
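The random subspace method summarized in the abstract can be sketched as follows. For brevity each basis classifier here is a nearest-class-mean rule rather than the piecewise quadratic classifier the paper actually uses; the function name `rsm_predict` and all parameter defaults are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def rsm_predict(X_train, y_train, X_test,
                n_classifiers=25, subspace_size=5, seed=0):
    """Random subspace method (RSM) sketch: fit each 'basis' classifier on a
    random subset of features, then merge the individual outputs by majority
    vote. The basis rule (nearest class mean) is a simple stand-in."""
    rng = np.random.default_rng(seed)
    classes = np.unique(y_train)
    votes = np.zeros((X_test.shape[0], classes.size), dtype=int)
    for _ in range(n_classifiers):
        # randomly select a feature subspace (without replacement)
        feats = rng.choice(X_train.shape[1], size=subspace_size, replace=False)
        # class means within the selected subspace
        means = np.stack([X_train[y_train == c][:, feats].mean(axis=0)
                          for c in classes])
        # assign each test pixel to the nearest class mean, record its vote
        d = np.linalg.norm(X_test[:, feats][:, None, :] - means[None, :, :],
                           axis=2)
        votes[np.arange(X_test.shape[0]), d.argmin(axis=1)] += 1
    return classes[votes.argmax(axis=1)]
```

Because each basis classifier sees only `subspace_size` features, it can be fitted from far fewer samples than the full 200-band problem would require, which is the mechanism behind RSM's robustness in the small-sample regime described above.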
