Journal: BMC Bioinformatics

Efficient algorithms for reconstructing gene content by co-evolution

Abstract

Background

In a previous study we demonstrated that co-evolutionary information can be utilized for improving the accuracy of ancestral gene content reconstruction. To this end, we defined a new computational problem, the Ancestral Co-Evolutionary (ACE) problem, and developed algorithms for solving it.

Results

In the current paper we generalize our previous study in various ways. First, we describe new efficient computational approaches for solving the ACE problem. The new approaches are based on reductions to classical methods such as linear programming relaxation, quadratic programming, and min-cut. Second, we report new computational hardness results related to the ACE problem, including practical cases where it can be solved in polynomial time. Third, we generalize the ACE problem and demonstrate how our approach can be used for inferring parts of the genomes of non-ancestral organisms. To this end, we describe a heuristic for finding the portion of the genome ('dominant set') that can be used to reconstruct the rest of the genome with the lowest error rate. This heuristic utilizes both evolutionary information and co-evolutionary information.

We implemented these algorithms on a large input of the ACE problem (95 unicellular organisms, 4,873 protein families, and 10,576 co-evolutionary relations), demonstrating that some of these algorithms can outperform the algorithm used in our previous study. In addition, we show that, based on our approach, a 'dominant set' can be used to reconstruct a major fraction of a genome (up to 79%) with a relatively low error rate (e.g. 0.11). We find that the 'dominant set' tends to include metabolic and regulatory genes, with high evolutionary rate, and low protein abundance and number of protein-protein interactions.

Conclusions

The ACE problem can be efficiently extended for inferring the genomes of organisms that exist today. In addition, it may be solved in polynomial time in many practical cases. Metabolic and regulatory genes were found to be the most important groups of genes necessary for reconstructing the gene content of an organism based on other related genomes.
