Generalization from Observed to Unobserved Features by Clustering

Krupka Eyal; Tishby Naftali

Abstract

We argue that when objects are characterized by many attributes, clustering them on the basis of a random subset of these attributes can capture information on the unobserved attributes as well. Moreover, we show that under mild technical conditions, clustering the objects on the basis of such a random subset performs almost as well as clustering with the full attribute set. We prove finite sample generalization theorems for this novel learning scheme that extends analogous results from the supervised learning setting. We use our framework to analyze generalization to unobserved features of two well-known clustering algorithms: k-means and the maximum likelihood multinomial mixture model. The scheme is demonstrated for collaborative filtering of users with movie ratings as attributes and document clustering with words as attributes.
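The idea in the abstract can be illustrated with a small sketch. It is not the authors' experimental setup: the synthetic data, the parameter values, and the helper names `kmeans` and `distortion` are assumptions made for illustration only. Objects are clustered with plain k-means using a random subset of features, and the resulting partition is then scored on the features that were held out, using within-cluster squared error as one possible measure.

```python
import random


def kmeans(points, k, rng, iters=20):
    """Plain Lloyd's k-means on lists of floats; returns one cluster label per point."""
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center in squared Euclidean distance.
        labels = [
            min(range(k), key=lambda c: sum((x - m) ** 2 for x, m in zip(p, centers[c])))
            for p in points
        ]
        # Update step: each center moves to the mean of its current members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def distortion(points, labels):
    """Mean squared distance to the cluster mean, averaged over objects and features."""
    total = 0.0
    for c in set(labels):
        members = [p for p, l in zip(points, labels) if l == c]
        mean = [sum(col) / len(members) for col in zip(*members)]
        total += sum((x - m) ** 2 for p in members for x, m in zip(p, mean))
    return total / (len(points) * len(points[0]))


rng = random.Random(0)
n_objects, n_features, n_observed, k = 200, 60, 20, 3  # illustrative sizes (assumed)

# Synthetic data (assumption): each object belongs to a latent group, and every
# feature has a group-specific mean plus Gaussian noise.
group_means = [[rng.gauss(0, 1) for _ in range(n_features)] for _ in range(k)]
groups = [rng.randrange(k) for _ in range(n_objects)]
data = [[group_means[g][j] + rng.gauss(0, 0.5) for j in range(n_features)] for g in groups]

# Observe only a random subset of the features; the rest are held out.
observed = rng.sample(range(n_features), n_observed)
observed_set = set(observed)
unobserved = [j for j in range(n_features) if j not in observed_set]
obs_data = [[row[j] for j in observed] for row in data]
unobs_data = [[row[j] for j in unobserved] for row in data]

# Cluster on observed features only, then score the partition on unobserved ones.
labels = kmeans(obs_data, k, rng)
baseline = distortion(unobs_data, [0] * n_objects)  # single cluster: no structure used
held_out = distortion(unobs_data, labels)

print(f"unobserved-feature distortion, one cluster:       {baseline:.3f}")
print(f"unobserved-feature distortion, observed-feature clustering: {held_out:.3f}")
```

If the partition found from the observed features carries information about the held-out ones, the second number should come out lower than the first. How close it gets to the value obtained by clustering on all features is the kind of gap the paper's generalization theorems are concerned with.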


Bibliographic details

Source: Journal of Machine Learning Research, 2008, no. 3, 32 pages
Authors: Krupka Eyal; Tishby Naftali
Original format: PDF
CLC classification: Computing technology, computer technology

Similar literature

1. Automated feature weighting in clustering with separable distances and inner product induced norms - A theoretical generalization [J]. Saha Arkajyoti, Das Swagatam. Pattern Recognition Letters. 2015, no. octa1
2. SENSITIVITY ANALYSIS FOR AN UNOBSERVED MODERATOR IN RCT-TO-TARGET-POPULATION GENERALIZATION OF TREATMENT EFFECTS (vol 11, pg 225, 2017) [J]. Trang Quynh Nguyen, Stuart Elizabeth A. The Annals of Applied Statistics. 2020, no. 1
3. On Negative Outcome Control of Unobserved Confounding as a Generalization of Difference-in-Differences [J]. Sofer Tamar, Richardson David B., Colicino Elena. Statistical Science. 2016, no. 3
4. Generalization in Clustering with Unobserved Features [C]. Eyal Krupka, Naftali Tishby. Annual Conference on Neural Information Processing Systems. 2006
5. A Relational Framework for Clustering and Cluster Validity and the Generalization of the Silhouette Measure [D]. Rawashdeh, Mohammad Y. 2013
6. On negative outcome control of unobserved confounding as a generalization of difference-in-differences [O]. Tamar Sofer, David B. Richardson, Elena Colicino
7. The Joint Benefits of Observed and Unobserved Punishment: Comment to Unobserved Punishment Supports Cooperation [O]. Glöckner, A., Kube, S., Nicklisch, A. 2011
