IEEE Transactions on Pattern Analysis and Machine Intelligence

Semi-Supervised Discriminative Classification Robust to Sample-Outliers and Feature-Noises

Abstract

Discriminative methods commonly produce models with relatively good generalization abilities. However, this advantage is challenged in real-world applications (e.g., medical image analysis problems), in which there often exist outlier data points (sample-outliers) and noise in the predictor values (feature-noises). Methods robust to both types of deviation are somewhat overlooked in the literature. We further argue that denoising can be more effective if we learn the model using all the available labeled and unlabeled samples, as the intrinsic geometry of the sample manifold can be better constructed using more data points. In this paper, we propose a semi-supervised robust discriminative classification method based on the least-squares formulation of linear discriminant analysis to detect sample-outliers and feature-noises simultaneously, using both labeled training and unlabeled testing data. We conduct experiments on a synthetic dataset, several benchmark semi-supervised learning datasets, and two brain neurodegenerative disease diagnosis datasets (for Parkinson's and Alzheimer's diseases). For the diagnosis of neurodegenerative diseases in particular, incorporating robust machine learning methods can be of great benefit, due to the noisy nature of neuroimaging data. Our results show that our method outperforms the baseline and several state-of-the-art methods in terms of both accuracy and the area under the ROC curve.
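The abstract builds on the least-squares formulation of linear discriminant analysis. The sketch below is not the paper's robust semi-supervised method. It only illustrates the standard two-class result that regressing zero-mean class-indicator targets on centered features gives the same direction as Fisher's LDA. The function names, the ridge term, and the synthetic data are assumptions made for illustration.

```python
import numpy as np


def ls_lda_direction(X, y, ridge=1e-6):
    """Two-class least-squares LDA: regress coded labels on centered features.

    X is an (n, d) feature matrix; y is an (n,) array of labels in {0, 1}.
    The small ridge term is an illustrative assumption for numerical stability.
    """
    n, d = X.shape
    n1 = np.sum(y == 1)
    n0 = n - n1
    Xc = X - X.mean(axis=0)
    # Zero-mean targets: n/n1 for class 1 and -n/n0 for class 0.
    t = np.where(y == 1, n / n1, -n / n0)
    return np.linalg.solve(Xc.T @ Xc + ridge * np.eye(d), Xc.T @ t)


def fisher_lda_direction(X, y, ridge=1e-6):
    """Classical Fisher direction: (within-class scatter)^-1 (mu1 - mu0)."""
    d = X.shape[1]
    mu0 = X[y == 0].mean(axis=0)
    mu1 = X[y == 1].mean(axis=0)
    X0 = X[y == 0] - mu0
    X1 = X[y == 1] - mu1
    Sw = X0.T @ X0 + X1.T @ X1
    return np.linalg.solve(Sw + ridge * np.eye(d), mu1 - mu0)


def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# Synthetic two-class data (hypothetical, for demonstration only).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (60, 3)), rng.normal(1.5, 1.0, (40, 3))])
y = np.array([0] * 60 + [1] * 40)

w_ls = ls_lda_direction(X, y)
w_fisher = fisher_lda_direction(X, y)
similarity = cosine(w_ls, w_fisher)
```

The two directions agree up to a positive scale factor, so `similarity` is numerically close to 1. A robust variant such as the one the abstract describes would additionally have to model sample-outliers and feature-noises, which this sketch does not attempt.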
