IEEE Transactions on Fuzzy Systems

Fast and Scalable Approaches to Accelerate the Fuzzy k-Nearest Neighbors Classifier for Big Data


Abstract

One of the best-known and most effective methods in supervised classification is the k-nearest neighbors algorithm (kNN). Several approaches have been proposed to improve its accuracy, among which fuzzy approaches prove to be some of the most successful, most notably the classical fuzzy k-nearest neighbors (FkNN) algorithm. However, these traditional algorithms fail to tackle the large amounts of data that are available today. There are multiple alternatives for enabling kNN classification on big datasets, most notably the approximate version of kNN known as the hybrid spill tree. Nevertheless, the existing FkNN proposals for big data problems are not fully scalable, because a high computational load is required to reproduce the behavior of the original FkNN algorithm. This article proposes global approximate hybrid spill tree FkNN and local hybrid spill tree FkNN, two approximate approaches that speed up the runtime without losing quality in the classification process. The experimentation compares various FkNN approaches for big data on datasets of up to 11 million instances. The results show an improvement in runtime and accuracy over the algorithms in the literature.
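
As background for the classification rule the abstract builds on, the sketch below shows the classical FkNN membership assignment (Keller-style fuzzy voting over the k nearest neighbors, with fuzzifier m). It is a minimal, illustrative example that uses exact brute-force neighbor search rather than the article's distributed hybrid spill tree approach; the function and parameter names (fknn_memberships, m, n_classes) are assumptions for illustration, not the paper's code.

import numpy as np

def fknn_memberships(X_train, y_train, x_query, k=5, m=2.0, n_classes=None):
    """Classical FkNN membership assignment (illustrative sketch).

    Returns one membership degree per class for x_query, weighting each of
    the k nearest neighbors by the inverse of its distance to the query
    raised to 2 / (m - 1), as in the classical fuzzy kNN rule.
    """
    if n_classes is None:
        n_classes = int(y_train.max()) + 1
    # Exact neighbor search; the article's approaches replace this step with
    # approximate hybrid spill tree searches to scale to millions of instances.
    dists = np.linalg.norm(X_train - x_query, axis=1)
    nn = np.argsort(dists)[:k]
    # Crisp neighbor memberships (1 for the neighbor's own class, 0 otherwise);
    # fuzzier neighbor initializations are also used in the literature.
    U = np.eye(n_classes)[y_train[nn]]
    # Inverse-distance weights; the small epsilon avoids division by zero.
    w = 1.0 / np.maximum(dists[nn], 1e-12) ** (2.0 / (m - 1.0))
    return (U * w[:, None]).sum(axis=0) / w.sum()

# Usage: the predicted class is the argmax of the returned membership vector.
# X_train: (n, d) float array, y_train: (n,) integer labels, x_query: (d,) array.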
