A Fast Approximation Algorithm for the k Partition-Distance Problem

机译：k分割距离问题的快速近似算法

获取原文

获取原文并翻译 | 示例

页面导航

摘要
著录项
相似文献
相关主题

摘要

Given a set of elements N, a partition consists on dividing the set of elements into two or more disjoint clusters that cover all elements. A cluster contains a non-empty subset of elements. The number of clusters of a partition is less than or equal to N. Different partitioning algorithms for the same application will produce different partitions from the same set of elements. To compute the distance and find the consensus partition (also called as consensus clustering) between two or more partitions are important and interesting problems that arise in many applications such as bioinformatics and data mining. However, different distance functions between two or more partitions will usually need to be computed by different algorithms. In this paper, we discuss the k partition-distance problem which can be applied in bioinformatics. Given a set of elements N with k partitions, the k partition-distance problem is to delete the minimum number of elements from each partition such that all remaining partitions become identical. However, this problem has been shown to be NP-complete when k > 2. We will present the first known approximation algorithm with performance ratio 2 to solve this problem in O(k * p * |N|) time, where p is the maximum number of clusters of these k partitions. Then we perform our algorithm for simulation of random data and actual the set of organisms based on DNA markers. It will show that our approximation solution is at most twice the partition-distance of the optimal solution in practice.

机译：给定一组元素N，分区包括将一组元素划分为两个或多个覆盖所有元素的不相交簇。集群包含元素的非空子集。一个分区的簇数小于或等于N。针对同一应用程序的不同分区算法将根据同一组元素生成不同的分区。计算距离并找到两个或多个分区之间的共识分区（也称为共识聚类）是重要且有趣的问题，这些问题在许多应用程序中出现，例如生物信息学和数据挖掘。但是，通常需要通过不同的算法来计算两个或多个分区之间的距离函数。在本文中，我们讨论了可应用于生物信息学的k分区距离问题。给定一组具有k个分区的元素N，第k个分区距离问题是从每个分区中删除最少数量的元素，以使所有其余分区都相同。然而，当k> 2时，这个问题已经证明是NP完全的。我们将提出第一个已知的性能比为2的近似算法，以在O（k * p * | N |）的时间内解决此问题，其中p是这k个分区的最大群集数。然后，我们执行我们的算法，以模拟随机数据并基于DNA标记实际设置一组生物。这将表明我们的近似解在实践中最多是最佳解的分区距离的两倍。

著录项

来源
《Computational science and its applications-ICCSA 2009》|2009年|537-546|共10页
会议地点 Seoul(KR);Seoul(KR)
作者
Yen Hung Chen;
展开▼
作者单位

Department of Information Science and Management Systems, National Taitung University, 684, Seel, Chunghua Rd., Taitung 950, Taiwan, R.O.C.;

展开▼
会议组织
原文格式 PDF
正文语种 eng
中图分类计算技术、计算机技术;
关键词
bioinformatics; data mining; approximation algorithm; partitioning; clustering; partition-distance; consensus clustering;

机译：生物信息学数据挖掘;近似算法分区集群分区距离共识聚类;

相似文献

外文文献
中文文献
专利

1. Faster Algorithms for Multivariate Interpolation With Multiplicities and Simultaneous Polynomial Approximations [J] . Chowdhury Muhammad F. /I/., Jeannerod Claude-Pierre, Neiger Vincent, Information Theory, IEEE Transactions on . 2015,第5期

机译：具有多重多项式和同时多项式逼近的多元插值的更快算法
2. Sampling and multilevel coarsening algorithms for fast matrix approximations [J] . Ubaru Shashanka, Saad Yousef Numerical linear algebra with applications . 2019,第3期

机译：用于快速矩阵近似的采样和多级粗化算法
3. Fast Algorithms and Architectures for 8-Point DST-II/DST-VII Approximations [J] . Cintra Renato J., Bayer Fabio M., Madanayake Arjuna, Journal of circuits, systems and computers . 2017,第3期

机译：8点DST-II / DST-VII近似的快速算法和体系结构
4. A Fast Approximation Algorithm for the k Partition-Distance Problem [C] . Yen Hung Chen International Conference on Computational Science and Its Applications . 2009

机译：k分区距离问题的快速近似算法
5. Fast Approximation Algorithms for Positive Linear Programs [D] . Wang, Di. 2017

机译：正线性程序的快速逼近算法
6. New Fast Arctangent Approximation Algorithm for Generic Real-Time Embedded Applications [O] . Mohieddine Benammar, Abdulrahman Alassi, Adel Gastli, 2019

机译：通用实时嵌入式应用的新快速反正切近似算法
7. Faster Algorithms for Multivariate Interpolation with Multiplicities and Simultaneous Polynomial Approximations [O] . Chowdhury, Muhammad Foizul Islam, Jeannerod, Claude-Pierre, Neiger, Vincent, 2015

机译：具有多重和同时多项式逼近的多元插值的更快算法

A Fast Approximation Algorithm for the k Partition-Distance Problem

摘要

著录项

相似文献

相关主题

期刊订阅