Conference: 13th European Conference on Machine Learning, Aug 19-23, 2002, Helsinki, Finland

Towards a Simple Clustering Criterion Based on Minimum Length Encoding

Abstract

We propose a simple and intuitive clustering evaluation criterion based on the minimum description length principle which yields a particularly simple way of describing and encoding a set of examples. The basic idea is to view a clustering as a restriction of the attribute domains, given an example's cluster membership. As a special operational case we develop the so-called rectangular uniform message length measure that can be used to evaluate clusterings described as sets of hyper-rectangles. We theoretically prove that this measure punishes cluster boundaries in regions of uniform instance distribution (i.e., unintuitive clusterings), and we experimentally compare a simple clustering algorithm using this measure with the well-known algorithms KMeans and AutoClass.
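
The abstract does not give the formula for the rectangular uniform message length, so the following is only a minimal sketch of the general idea, under stated assumptions: each cluster is taken to be the bounding hyper-rectangle of its members, attribute values are coded uniformly at a hypothetical resolution `eps`, cluster labels use a fixed-length code, and rectangle bounds are priced against the full attribute range. The function name `uniform_message_length` and all of these coding choices are illustrative and are not taken from the paper.

```python
import math


def uniform_message_length(points, labels, eps=0.01):
    """Two-part code length in bits for a clustering given as bounding rectangles.

    Illustrative assumption only, not the paper's exact definition.
    """
    dims = len(points[0])
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)

    def bits_for_width(width):
        # Bits needed to pick one of the eps-sized cells in an interval of this width.
        return math.log2(int(width / eps) + 1)

    def extent(members, j):
        values = [p[j] for p in members]
        return max(values) - min(values)

    # Model cost: two bounds per attribute per cluster, each located in the full domain.
    bound_bits = sum(bits_for_width(extent(points, j)) for j in range(dims))
    model_bits = 2 * len(clusters) * bound_bits

    # Membership cost: a fixed-length cluster label per example.
    label_bits = len(points) * math.log2(len(clusters))

    # Data cost: each attribute value coded uniformly inside its cluster's rectangle.
    data_bits = sum(
        len(members) * bits_for_width(extent(members, j))
        for members in clusters.values()
        for j in range(dims)
    )
    return model_bits + label_bits + data_bits


# Evenly spaced 1-D data versus two well-separated groups (made-up toy data).
uniform = [(i * 0.1,) for i in range(101)]
bimodal = [(i * 0.02,) for i in range(51)] + [(9 + i * 0.02,) for i in range(51)]

print(uniform_message_length(uniform, [0] * 101),
      uniform_message_length(uniform, [0] * 51 + [1] * 50))
print(uniform_message_length(bimodal, [0] * 102),
      uniform_message_length(bimodal, [0] * 51 + [1] * 51))
```

Under these assumptions, cutting the evenly spaced data in half costs more bits than leaving it as one cluster, while separating the two distant groups costs fewer, which mirrors the behaviour the abstract claims for the measure.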
