High-dimensional cluster analysis with the Masked EM Algorithm

Abstract

Cluster analysis faces two problems in high dimensions: first, the “curse of dimensionality”, which can lead to overfitting and poor generalization performance; and second, the sheer time conventional algorithms take to process large amounts of high-dimensional data. We describe a solution to these problems, designed for the application of “spike sorting” for next-generation high-channel-count neural probes. In this problem, only a small subset of features provides information about the cluster membership of any one data vector, but this informative feature subset is not the same for all data points, rendering classical feature selection ineffective. We introduce a “Masked EM” algorithm that allows accurate and time-efficient clustering of up to millions of points in thousands of dimensions. We demonstrate its applicability to synthetic data and to real-world high-channel-count spike sorting data.
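
The point about per-point informative features can be illustrated with a small sketch. This is not the Masked EM algorithm itself, which the abstract does not specify; it only shows, on toy data, why a single global feature selection fails when each point is informative on a different subset of features. The function names `informative_mask` and `apply_mask`, the amplitude threshold, and the choice to fill uninformative entries with a noise mean are all assumptions made for illustration.

```python
import numpy as np

def informative_mask(X, threshold):
    """Hypothetical helper: True where a feature's magnitude exceeds the threshold."""
    return np.abs(X) > threshold

def apply_mask(X, mask, noise_mean):
    """Hypothetical helper: replace masked-out entries with a per-feature noise mean."""
    return np.where(mask, X, noise_mean)

rng = np.random.default_rng(0)
n_points, n_features = 6, 8

# Low-amplitude background on every feature (toy stand-in for noise).
X = rng.normal(0.0, 0.1, size=(n_points, n_features))

# Each point carries a strong signal on two adjacent features,
# at a different position for each point.
for i in range(n_points):
    X[i, i:i + 2] += 5.0

# Assumption: a simple amplitude threshold decides which features are informative.
mask = informative_mask(X, threshold=1.0)
X_masked = apply_mask(X, mask, noise_mean=np.zeros(n_features))

per_point = mask.sum(axis=1)    # informative features for each point
union = mask.any(axis=0).sum()  # features informative for at least one point

print("informative features per point:", per_point)
print("features needed by a single global selection:", union, "of", n_features)
```

In this toy data each point uses only two features, yet a global selection that must serve every point has to retain almost all of them, which is the situation the abstract describes as rendering classical feature selection ineffective.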
