Conference: SIAM International Conference on Data Mining

Massive-Scale Kernel Discriminant Analysis: Mining for Quasars

Abstract

We describe a fast algorithm for kernel discriminant analysis, empirically demonstrating asymptotic speed-up over the previous best approach. We achieve this with a new pattern of processing data stored in hierarchical trees, which incurs low overhead while helping to prune unnecessary work once classification results can be shown, and the use of the Epanechnikov kernel, which allows additional pruning between portions of data shown to be far apart or very near each other. Further, our algorithm may share work between multiple simultaneous bandwidth computations, thus facilitating a rudimentary but nonetheless quick and effective means of bandwidth optimization. We apply a parallelized implementation of our algorithm to a large data set (40 million points in 4D) from the Sloan Digital Sky Survey, identifying approximately one million quasars with high accuracy. This exceeds the previous largest catalog of quasars in size by a factor of ten.
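
To make the role of the Epanechnikov kernel concrete, the following is a minimal brute-force sketch of two-class kernel discriminant analysis. It is not the tree-based algorithm the abstract describes; it only illustrates the decision rule and the finite support of the kernel, which is the property that lets far-apart portions of data be skipped. The function names, the labels "a" and "b", the per-class bandwidths, and the `prior_a` parameter are all assumptions made for illustration.

```python
def epanechnikov(dist_sq, h):
    """Unnormalized Epanechnikov kernel, 1 - (d/h)^2 inside radius h, else 0."""
    u_sq = dist_sq / (h * h)
    return 1.0 - u_sq if u_sq < 1.0 else 0.0


def kernel_sum(query, points, h):
    """Sum of kernel values between one query point and a set of points."""
    total = 0.0
    for p in points:
        dist_sq = sum((a - b) ** 2 for a, b in zip(query, p))
        # Points farther than h contribute exactly zero; a tree-based method
        # could skip whole groups of such points without visiting them.
        total += epanechnikov(dist_sq, h)
    return total


def classify(query, class_a, class_b, h_a, h_b, prior_a=0.5):
    """Return "a" or "b" by comparing prior-weighted density estimates.

    Returns None when the two scores are equal, for example when the query
    lies outside the kernel support of every training point.
    """
    dim = len(query)
    # The kernel's dimension-dependent normalizing constant is the same for
    # both classes, so it cancels; the 1 / (N * h^dim) factor does not.
    score_a = prior_a * kernel_sum(query, class_a, h_a) / (len(class_a) * h_a ** dim)
    score_b = (1.0 - prior_a) * kernel_sum(query, class_b, h_b) / (len(class_b) * h_b ** dim)
    if score_a > score_b:
        return "a"
    if score_b > score_a:
        return "b"
    return None


# Toy 2D data, invented for this example.
class_a = [(0.0, 0.0), (0.1, 0.0)]
class_b = [(1.0, 1.0), (1.1, 1.0)]
label_near_a = classify((0.05, 0.0), class_a, class_b, 0.5, 0.5)
label_near_b = classify((1.0, 1.0), class_a, class_b, 0.5, 0.5)
label_far = classify((5.0, 5.0), class_a, class_b, 0.5, 0.5)
```

This sketch costs time proportional to the number of training points for every query, which is the cost that hierarchical trees and pruning are meant to avoid at the scale of tens of millions of points.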
