Journal: Expert Systems with Applications
Semi-supervised support vector regression based on self-training with label uncertainty: An application to virtual metrology in semiconductor manufacturing

Abstract

Datasets continue to grow, and data are being collected from numerous applications. Because collecting labeled data is expensive and time-consuming, the amount of unlabeled data is increasing. Semi-supervised learning (SSL) has been proposed to improve conventional supervised learning methods by training on both unlabeled and labeled data. In contrast to classification problems, the estimation of labels for unlabeled data presents added uncertainty in regression problems. In this paper, a semi-supervised support vector regression (SS-SVR) method based on self-training is proposed. The proposed method addresses the uncertainty of the estimated labels for unlabeled data. To measure labeling uncertainty, the label distribution of the unlabeled data is estimated with two probabilistic local reconstruction (PLR) models. Then, the training data are generated by oversampling from the unlabeled data and their estimated label distribution. The sampling rate varies according to the uncertainty. Finally, expected margin-based pattern selection (EMPS) is employed to reduce training complexity. We verify the proposed method on 30 regression datasets and a real-world problem: virtual metrology (VM) in semiconductor manufacturing. The experimental results show that the proposed method improves accuracy by 8% compared with conventional supervised SVR, and that its training time is 20% shorter than that of the benchmark methods. © 2015 Elsevier Ltd. All rights reserved.
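The oversampling step described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation and rests on several assumptions. The label distribution is estimated from the k nearest labeled neighbours, a simple stand-in for the two PLR models. The rule that draws fewer samples for more uncertain points is hypothetical, since the abstract only states that the sampling rate depends on uncertainty. The names `knn_label_distribution`, `oversample`, and `max_copies` are made up for illustration.

```python
import math
import random


def knn_label_distribution(x, labeled, k=3):
    """Estimate (mean, variance) of the label of x from its k nearest
    labeled points. A simple stand-in for the PLR models in the abstract,
    NOT the paper's estimator."""
    neighbors = sorted(labeled, key=lambda p: math.dist(x, p[0]))[:k]
    ys = [y for _, y in neighbors]
    mean = sum(ys) / len(ys)
    var = sum((y - mean) ** 2 for y in ys) / len(ys)
    return mean, var


def oversample(unlabeled, labeled, k=3, max_copies=5, seed=0):
    """Generate pseudo-labeled training pairs from unlabeled inputs.

    Assumed rule (hypothetical): the number of samples drawn for a point
    shrinks as its estimated label variance grows.
    """
    rng = random.Random(seed)
    generated = []
    for x in unlabeled:
        mean, var = knn_label_distribution(x, labeled, k)
        n_copies = max(1, round(max_copies / (1.0 + var)))
        for _ in range(n_copies):
            # Draw a label from the estimated (Gaussian) label distribution.
            generated.append((x, rng.gauss(mean, math.sqrt(var))))
    return generated


if __name__ == "__main__":
    labeled = [((0.0,), 0.0), ((1.0,), 1.0), ((2.0,), 2.0), ((3.0,), 3.0)]
    unlabeled = [(1.5,), (0.2,)]
    augmented = labeled + oversample(unlabeled, labeled, k=2)
    # The augmented set would then be passed to an SVR learner
    # (for example scikit-learn's sklearn.svm.SVR); EMPS is not shown here.
    print(len(augmented))
```

In a full self-training loop the generated pairs would be filtered (the abstract uses EMPS for this) before the SVR is retrained.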
