Journal of Machine Learning Research

Generalization from Observed to Unobserved Features by Clustering

Abstract

We argue that when objects are characterized by many attributes, clustering them on the basis of a random subset of these attributes can capture information on the unobserved attributes as well. Moreover, we show that under mild technical conditions, clustering the objects on the basis of such a random subset performs almost as well as clustering with the full attribute set. We prove finite sample generalization theorems for this novel learning scheme that extends analogous results from the supervised learning setting. We use our framework to analyze generalization to unobserved features of two well-known clustering algorithms: k-means and the maximum likelihood multinomial mixture model. The scheme is demonstrated for collaborative filtering of users with movie ratings as attributes and document clustering with words as attributes.
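
The idea in the abstract can be illustrated with a small sketch: cluster objects using only a random subset of their attributes, then check how tightly the resulting clusters group the attributes that were never seen. Everything below is an assumption made for illustration and is not taken from the paper: the synthetic Gaussian data, the plain Lloyd's k-means, the hypothetical helper names (`kmeans`, `within_cluster_mse`, `make_data`, `demo`), and the choice of within-cluster mean squared error as the quality measure.

```python
import numpy as np


def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means; returns one cluster label per row of X."""
    rng = np.random.default_rng(seed)
    # Initialise centers at k distinct data points (fancy indexing copies).
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared distance from every row to every center, shape (n, k).
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members):  # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    return labels


def within_cluster_mse(X, labels):
    """Mean squared deviation (per entry) of rows from their cluster mean."""
    total = 0.0
    for j in np.unique(labels):
        members = X[labels == j]
        total += ((members - members.mean(axis=0)) ** 2).sum()
    return total / X.size


def make_data(n=300, d=200, k=3, noise=1.0, seed=0):
    """Synthetic objects: each row is a latent cluster prototype plus noise."""
    rng = np.random.default_rng(seed)
    true_cluster = rng.integers(k, size=n)
    prototypes = rng.normal(size=(k, d))
    return prototypes[true_cluster] + noise * rng.normal(size=(n, d))


def demo(seed=0, n_observed=40, k=3):
    X = make_data(k=k, seed=seed)
    rng = np.random.default_rng(seed + 1)
    # Randomly split the attributes into observed and unobserved sets.
    perm = rng.permutation(X.shape[1])
    observed, unobserved = perm[:n_observed], perm[n_observed:]
    # Cluster using the observed attributes only.
    labels = kmeans(X[:, observed], k=k, seed=seed)
    # Baseline: labels that carry no information about the objects.
    random_labels = rng.integers(k, size=len(X))
    return {
        "observed": within_cluster_mse(X[:, observed], labels),
        "unobserved": within_cluster_mse(X[:, unobserved], labels),
        "unobserved_random_baseline": within_cluster_mse(
            X[:, unobserved], random_labels
        ),
    }


if __name__ == "__main__":
    for name, value in demo().items():
        print(f"{name}: {value:.3f}")
```

On data of this kind, the error on the unobserved attributes should fall well below the random-label baseline, which is the qualitative behavior the abstract describes. The sketch says nothing about the finite sample bounds themselves.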
