Source: Journal of Applied Statistics

Distance-based outlier detection for high dimension, low sample size data

Abstract

Despite the popularity of high dimension, low sample size (HDLSS) data analysis, the issue of sample integrity has received little attention, in particular the possibility of outliers in the data. A new outlier detection procedure is presented for data whose dimensionality is much larger than the sample size. The proposed method is motivated by asymptotic properties of high-dimensional distance measures. Empirical studies suggest that high-dimensional outlier detection is more likely to suffer from a swamping effect than from a masking effect, and thus yields more false positives than false negatives. We compare the proposed approaches with existing methods using simulated data from various population settings. A real data example is presented, with consideration of the implications of the outliers found.
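
The abstract does not spell out the proposed procedure, so the following is only a minimal sketch of the general distance-based idea, not the authors' method. It assumes a leave-one-out centroid distance as the outlier score and a median-plus-scaled-MAD cutoff; the function names `loo_centroid_distances` and `flag_outliers`, the cutoff multiplier `k`, and the simulated data are all illustrative assumptions.

```python
import numpy as np

def loo_centroid_distances(X):
    """Euclidean distance from each row to the mean of the remaining rows."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    total = X.sum(axis=0)
    # Row i of loo_means is the mean of all rows except row i.
    loo_means = (total - X) / (n - 1)
    return np.linalg.norm(X - loo_means, axis=1)

def flag_outliers(X, k=3.0):
    """Flag rows whose score exceeds median + k * (scaled MAD).

    The cutoff rule is an assumption made for this sketch,
    not a rule taken from the paper.
    """
    d = loo_centroid_distances(X)
    med = np.median(d)
    mad = np.median(np.abs(d - med))
    return d, d > med + k * 1.4826 * mad

# Simulated HDLSS setting: n = 20 observations, d = 500 dimensions.
rng = np.random.default_rng(0)
n, d = 20, 500
X = rng.standard_normal((n, d))
X[0] += 1.0  # shift every coordinate of the first observation

dist, flags = flag_outliers(X)
print("largest score at index:", int(np.argmax(dist)))
print("flagged indices:", np.flatnonzero(flags))
```

In this toy setting the shifted observation stands far from the rest because small per-coordinate shifts accumulate over many dimensions. Any clean observations that also exceed the cutoff would be false positives, which is the swamping effect the abstract refers to.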
