
Projected clustering with subset selection


Abstract

Clustering high-dimensional data has always been a major challenge because of the inherent sparsity of the data points. Our model uses attribute selection and handles the sparse structure of the data effectively. The subset selection is done by two different methods. In the first method, we use LASSO (Least Absolute Shrinkage and Selection Operator) to select the subset of most informative attributes that preserve the cluster structure. Though there are other methods for attribute selection, LASSO has the distinctive property that it selects the most correlated set of attributes of the data. In the second method, we select a subset of linearly independent attributes using QR factorization. The model also identifies the dominant attributes of each cluster, which retain their predictive power as well. The quality of the projected clusters formed is also assured by the use of LASSO.
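The two selection routes named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The abstract does not say which response the LASSO is fitted against, so the sketch assumes provisional cluster labels coded as -1/+1 from some earlier clustering step. The function names `lasso_select` and `independent_columns`, the penalty `alpha`, and the tolerance `tol` are all hypothetical. The QR route is approximated by a greedy column-pivoted Gram-Schmidt pass, which picks columns in the same spirit as QR factorization with column pivoting.

```python
import numpy as np


def lasso_select(X, y, alpha, n_iter=100):
    """Return (indices of attributes with nonzero weight, weights).

    Minimises (1/2n)*||y - Xw||^2 + alpha*||w||_1 by cyclic coordinate
    descent. ASSUMPTION: y holds provisional cluster labels (-1/+1);
    the abstract does not specify the response variable.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    X = X - X.mean(axis=0)          # centre attributes
    y = y - y.mean()                # centre response
    n, d = X.shape
    w = np.zeros(d)
    col_sq = (X ** 2).sum(axis=0) / n
    r = y.copy()                    # residual y - Xw (w starts at zero)
    for _ in range(n_iter):
        for j in range(d):
            if col_sq[j] == 0.0:
                continue            # constant attribute carries no signal
            r += X[:, j] * w[j]     # drop attribute j from the current fit
            rho = X[:, j] @ r / n
            # Soft-thresholding sets weakly correlated attributes to exactly 0.
            w[j] = np.sign(rho) * max(abs(rho) - alpha, 0.0) / col_sq[j]
            r -= X[:, j] * w[j]     # add attribute j back with its new weight
    return [int(j) for j in np.flatnonzero(w)], w


def independent_columns(X, tol=1e-10):
    """Return indices of a maximal set of linearly independent columns.

    Greedy column-pivoted Gram-Schmidt: repeatedly take the column with
    the largest remaining norm and project it out of all other columns.
    """
    R = np.array(X, dtype=float)    # working copy of residual columns
    scale = max(np.linalg.norm(R, axis=0).max(), 1.0)
    selected = []
    for _ in range(min(R.shape)):
        norms = np.linalg.norm(R, axis=0)
        j = int(np.argmax(norms))
        if norms[j] <= tol * scale:
            break                   # remaining columns lie in the chosen span
        q = R[:, j] / norms[j]
        R -= np.outer(q, q @ R)     # remove the q-component from every column
        selected.append(j)
    return selected


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    labels = np.repeat([-1.0, 1.0], 500)
    X = rng.normal(size=(1000, 6))
    X[:, 0] += labels               # only attributes 0 and 1 separate the clusters
    X[:, 1] += labels
    kept, weights = lasso_select(X, labels, alpha=0.2)
    print("LASSO keeps attributes:", kept)

    a = np.array([1.0, 0.0, 0.0, 1.0])
    b = np.array([0.0, 1.0, 0.0, 1.0])
    A = np.column_stack([a, b, a + b, 2 * a])
    print("independent columns:", independent_columns(A))
```

In this sketch the L1 penalty drives the weights of uninformative attributes to exactly zero, which is what makes LASSO usable as a selector, while the pivoted pass stops as soon as every remaining column is numerically inside the span of those already chosen.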
