Journal: Computers & Operations Research

Capacitated clustering problem in computational biology: Combinatorial and statistical approach for sibling reconstruction



Abstract

The capacitated clustering problem (CCP) has been studied in a wide range of applications. In this study, we investigate a challenging CCP in computational biology, namely the sibling reconstruction problem (SRP). The goal of the SRP is to establish the sibling relationships (i.e., groups of siblings) in a population from genetic data. The SRP has attracted growing interest from computational biologists over the past decade because it is an essential building block for studies in genetics and population biology. We propose a large-scale mixed-integer formulation of the CCP for the SRP that is based on both combinatorial and statistical genetic concepts. The objective is not only to find the minimum number of sibling groups, but also to maximize the degree of similarity among individuals in the same sibling group, with each sibling group subject to genetic constraints derived from Mendel's laws. We develop a new randomized greedy optimization algorithm to solve this SRP effectively and efficiently. The algorithm consists of two key phases: construction and enhancement. In the construction phase, a greedy approach with randomized perturbation is applied to construct multiple sibling groups iteratively. In the enhancement phase, a two-stage local search with a memory function is used to improve the solution quality with respect to the similarity measure. We demonstrate the effectiveness of the proposed algorithm on real biological data sets and compare it with state-of-the-art approaches from the literature. We also test it on larger simulated data sets. The experimental results show that the proposed algorithm provides the best reconstruction solutions among the approaches compared.
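The two-phase structure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract gives no formulas, so everything concrete here is an assumption. The feasibility test uses the simple "at most four distinct alleles per locus" property of full siblings with diploid parents as a stand-in for the Mendelian constraints, the similarity measure (shared alleles per locus) is hypothetical, and the enhancement phase is a plain single-stage relocation search without the memory function. The names `feasible`, `similarity`, `construct`, `improve`, and `solve` are all illustrative.

```python
import random
from itertools import combinations

# An individual is a tuple of loci; each locus is a pair of allele labels.

def feasible(members):
    """Assumed stand-in for the Mendelian constraints:
    a sibling group may show at most 4 distinct alleles at any locus."""
    for locus in range(len(members[0])):
        alleles = {a for ind in members for a in ind[locus]}
        if len(alleles) > 4:
            return False
    return True

def similarity(x, y):
    """Hypothetical similarity: number of distinct alleles shared, summed over loci."""
    return sum(len(set(a) & set(b)) for a, b in zip(x, y))

def total_score(individuals, groups):
    """Sum of pairwise similarities inside every group."""
    return sum(similarity(individuals[i], individuals[j])
               for g in groups for i, j in combinations(g, 2))

def construct(individuals, rng, noise=0.5):
    """Construction phase: grow one group at a time, greedily adding the
    feasible candidate with the highest randomly perturbed similarity gain."""
    unassigned = list(range(len(individuals)))
    rng.shuffle(unassigned)
    groups = []
    while unassigned:
        group = [unassigned.pop()]
        while True:
            best, best_gain = None, None
            for i in unassigned:
                members = [individuals[j] for j in group] + [individuals[i]]
                if not feasible(members):
                    continue
                gain = sum(similarity(individuals[i], individuals[j]) for j in group)
                gain += rng.uniform(0, noise)  # randomized perturbation
                if best is None or gain > best_gain:
                    best, best_gain = i, gain
            if best is None:
                break
            group.append(best)
            unassigned.remove(best)
        groups.append(group)
    return groups

def improve(individuals, groups):
    """Enhancement phase (simplified): move one individual to another group
    whenever that strictly increases total similarity and stays feasible."""
    groups = [list(g) for g in groups]
    improved = True
    while improved:
        improved = False
        for s in range(len(groups)):
            for t in range(len(groups)):
                if s == t:
                    continue
                for i in list(groups[s]):
                    loss = sum(similarity(individuals[i], individuals[j])
                               for j in groups[s] if j != i)
                    gain = sum(similarity(individuals[i], individuals[j])
                               for j in groups[t])
                    target = [individuals[j] for j in groups[t]] + [individuals[i]]
                    if gain > loss and feasible(target):
                        groups[s].remove(i)
                        groups[t].append(i)
                        improved = True
    return [g for g in groups if g]  # drop groups emptied by relocation

def solve(individuals, restarts=20, seed=0):
    """Keep the restart with the fewest groups, breaking ties by similarity."""
    rng = random.Random(seed)
    best_key, best_groups = None, None
    for _ in range(restarts):
        groups = improve(individuals, construct(individuals, rng))
        key = (len(groups), -total_score(individuals, groups))
        if best_key is None or key < best_key:
            best_key, best_groups = key, groups
    return best_groups

# Toy data: six individuals genotyped at two loci (made up for illustration).
toy = [
    ((1, 3), (1, 3)), ((2, 4), (2, 4)), ((1, 4), (2, 3)),
    ((5, 7), (5, 7)), ((6, 8), (6, 8)), ((5, 8), (6, 7)),
]
result = solve(toy)
```

The lexicographic key in `solve` mirrors the two objectives named in the abstract (fewest sibling groups first, then highest within-group similarity); how the paper actually weighs them in its mixed-integer formulation is not stated here. Because the four-allele test is only a necessary condition, this sketch may accept groups that a full Mendelian check would reject.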
