Journal: Algorithms for Molecular Biology
The distance and median problems in the single-cut-or-join model with single-gene duplications

Abstract

In the field of genome rearrangement algorithms, models accounting for gene duplication often lead to hard problems. For example, while computing the pairwise distance is tractable in most duplication-free models, the problem is NP-complete for most extensions of these models accounting for duplicated genes. Moreover, problems involving more than two genomes, such as the genome median and the Small Parsimony problem, are intractable for most duplication-free models, with some exceptions, for example the Single-Cut-or-Join (SCJ) model. We introduce a variant of the SCJ distance that accounts for duplicated genes, in the context of directed evolution from an ancestral genome to a descendant genome where orthology relations between ancestral genes and their descendants are known. Our model includes two duplication mechanisms: single-gene tandem duplication and the creation of single-gene circular chromosomes. We prove that in this model, computing the directed distance and a parsimonious evolutionary scenario in terms of SCJ and single-gene duplication events can be done in linear time. We also show that the directed median problem is tractable for this distance, while the rooted median problem, where we assume that one of the given genomes is ancestral to the median, is NP-complete. We also describe an Integer Linear Program for solving this problem. We evaluate the directed distance and rooted median algorithms on simulated data. Our results provide a simple genome rearrangement model, extending the SCJ model to account for single-gene duplications, for which we prove a mix of tractability and hardness results. For the NP-complete rooted median problem, we design a simple Integer Linear Program. Our publicly available implementation of these algorithms for the directed distance and median problems makes it possible to solve these problems efficiently on large instances.
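For orientation, the sketch below illustrates the classic duplication-free SCJ model that the abstract takes as its starting point. It is not the directed distance with single-gene duplications introduced in the paper. In the classic model, a genome is summarized by its set of adjacencies (pairs of neighbouring gene extremities), the distance between two genomes is the size of the symmetric difference of their adjacency sets, and a median can be read off from the adjacencies shared by a strict majority of the input genomes. The genome encoding (signed integers, a circularity flag) and all function names are assumptions chosen for illustration.

```python
from collections import Counter
from typing import FrozenSet, List, Set, Tuple

Extremity = Tuple[int, str]           # (gene id, "h" for head or "t" for tail)
Adjacency = FrozenSet[Extremity]      # unordered pair of extremities
Chromosome = Tuple[List[int], bool]   # (signed gene order, is_circular)
Genome = List[Chromosome]


def left_end(gene: int) -> Extremity:
    """Extremity met first when reading the gene left to right."""
    return (abs(gene), "t") if gene > 0 else (abs(gene), "h")


def right_end(gene: int) -> Extremity:
    """Extremity met last when reading the gene left to right."""
    return (abs(gene), "h") if gene > 0 else (abs(gene), "t")


def adjacencies(genome: Genome) -> Set[Adjacency]:
    """Collect the adjacencies of a genome without duplicated genes."""
    adj: Set[Adjacency] = set()
    for genes, circular in genome:
        for a, b in zip(genes, genes[1:]):
            adj.add(frozenset((right_end(a), left_end(b))))
        if circular and genes:
            # Close the circle: last gene is adjacent to the first one.
            adj.add(frozenset((right_end(genes[-1]), left_end(genes[0]))))
    return adj


def scj_distance(a: Genome, b: Genome) -> int:
    """Classic SCJ distance: one cut or join per differing adjacency."""
    return len(adjacencies(a) ^ adjacencies(b))


def scj_median_adjacencies(genomes: List[Genome]) -> Set[Adjacency]:
    """Adjacencies present in a strict majority of the input genomes."""
    counts = Counter(adj for g in genomes for adj in adjacencies(g))
    return {adj for adj, n in counts.items() if 2 * n > len(genomes)}


if __name__ == "__main__":
    g1 = [([1, 2, 3], False)]
    g2 = [([1, -2, 3], False)]   # gene 2 inverted
    g3 = [([1, 2, 3], False)]
    print(scj_distance(g1, g2))
    print(scj_median_adjacencies([g1, g2, g3]) == adjacencies(g1))
```

Each genome is scanned once and the set operations take expected linear time, which gives a feel for why SCJ-style distances can be computed efficiently. Handling duplicated genes, as the paper does, needs additional machinery that this sketch does not attempt.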