首页> 外文期刊>Numerical analysis and applications >Exact Algorithms of Search for a Cluster of the Largest Size in Two Integer 2-Clustering Problems
【24h】

Exact Algorithms of Search for a Cluster of the Largest Size in Two Integer 2-Clustering Problems

机译:在两个整数2聚类问题中搜索最大大小集群的精确算法

获取原文
获取原文并翻译 | 示例
           

摘要

We consider two related discrete optimization problems of searching for a subset in a finite set of points in Euclidean space. Both problems are induced by versions of a fundamental problem in data analysis, namely, that of selecting a subset of similar elements in a set of objects. In each problem, given an input set and a positive real number, it is required to find a cluster (i.e., a subset) of the largest size under constraints on a quadratic clusterization function. The points in the input set, which are outside the sought-for subset, are treated as a second (complementary) cluster. In the first problem, the function under the constraint is the sum over both clusters of the intracluster sums of the squared distances between the elements of the clusters and their centers. The center of the first (i.e., the sought-for) cluster is unknown and determined as a centroid, while the center of the second one is fixed at a given point in Euclidean space (without loss of generality, at the origin of coordinates). In the second problem, the function under the constraint is the sum over both clusters of the weighted intracluster sums of the squared distances between the elements of the clusters and their centers. As in the first problem, the center of the first cluster is unknown and determined as a centroid, while the center of the second one is fixed at the origin of coordinates. In this paper, we show that both problems are strongly NP-hard. Also, we present exact algorithms for the problems in which the input points have integer components. If the space dimension is bounded by some constant, the algorithms are pseudopolynomial.
机译:我们考虑在欧几里德空间中的有限点中搜索子集的两个相关的离散优化问题。通过数据分析中的基本问题的版本诱导了这两个问题,即选择一组对象中的类似元素的子集。在每个问题中,给定输入集和正则数量,需要在二次集群化函数上的约束下找到最大大小的群集(即,子集)。在寻求子集外的输入集中的点被视为第二(互补)群集。在第一个问题中,约束下的功能是在集群和其中心之间的平坦距离之间的间平距离的两个簇的总和。第一(即,寻求的)群集的中心未知并确定为质心,而第二个的中心在欧几里德空间的给定点处固定在欧几里德空间的给定点(不损失坐标的起源) 。在第二个问题中,约束下的功能是在集群和其中心的元件之间的平方距离的加权距离的两个簇的总和。如在第一个问题中,第一簇的中心未知并且被确定为质心,而第二个簇的中心在坐标的起源处固定。在本文中,我们表明这两个问题都很强烈。此外,我们为输入点具有整数组件的问题提出了精确的算法。如果空间尺寸受到一些常数的界限,则算法是伪二极管。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号