
Algorithms for Mining Distance-Based Outliers in Large Datasets

Abstract

This paper deals with finding outliers (exceptions) in large, multidimensional datasets. The identification of outliers can lead to the discovery of truly unexpected knowledge in areas such as electronic commerce, credit card fraud, and even the analysis of performance statistics of professional athletes. Existing methods that we have seen for finding outliers in large datasets can only deal efficiently with two dimensions/attributes of a dataset. Here, we study the notion of DB- (Distance-Based) outliers. While we provide formal and empirical evidence showing the usefulness of DB-outliers, we focus on the development of algorithms for computing such outliers.
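The abstract does not spell out what a DB-outlier is. As a minimal sketch, assuming the commonly cited DB(p, D) formulation (an object is an outlier if at least a fraction p of the objects in the dataset lie at a distance greater than D from it), a naive quadratic check might look like the following. The function name `db_outliers`, the Euclidean metric, and the convention of counting the fraction over all N objects are assumptions made for illustration; this is not one of the algorithms developed in the paper.

```python
import math


def db_outliers(points, p, d):
    """Return indices of points that satisfy the assumed DB(p, d) test.

    Hypothetical helper: a point is flagged when at least a fraction p
    of all points lie farther than d from it (Euclidean distance).
    The point itself is at distance 0, so it never counts as "far".
    """
    n = len(points)
    outliers = []
    for i, a in enumerate(points):
        # Count the points lying strictly farther than d from point i.
        far = sum(
            1
            for j, b in enumerate(points)
            if j != i and math.dist(a, b) > d
        )
        if far >= p * n:
            outliers.append(i)
    return outliers


# Four clustered points and one distant point.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
print(db_outliers(points, p=0.75, d=3.0))
```

This brute-force loop compares every pair of points, so its cost grows quadratically with the dataset size, which is the kind of cost that motivates more specialized algorithms for large datasets.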