GET_HOMOLOGUES a Versatile Software Package for Scalable and Robust Microbial Pangenome Analysis

Abstract

GET_HOMOLOGUES is an open-source software package that builds on popular orthology-calling approaches, making highly customizable and detailed pangenome analyses of microorganisms accessible to nonbioinformaticians. It can cluster homologous gene families using the bidirectional best-hit, COGtriangles, or OrthoMCL clustering algorithms. Clustering stringency can be adjusted by scanning the domain composition of proteins using the HMMER3 package, by imposing desired pairwise alignment coverage cutoffs, or by selecting only syntenic genes. The resulting homologous gene families can be made even more robust by computing consensus clusters from those generated by any combination of the clustering algorithms and filtering criteria. Auxiliary scripts make the construction, interrogation, and graphical display of core genome and pangenome sets easy to perform. Exponential and binomial mixture models can be fitted to the data to estimate theoretical core genome and pangenome sizes, and high-quality graphics can be generated. Furthermore, pangenome trees can be easily computed and basic comparative genomics performed to identify lineage-specific genes or gene family expansions. The software is designed to take advantage of modern multiprocessor personal computers as well as computer clusters to parallelize time-consuming tasks. To demonstrate some of these capabilities, we survey a set of 50 Streptococcus genomes annotated in the Orthologous Matrix (OMA) browser as a benchmark case. The package can be downloaded at and .
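
As a rough illustration of two ideas named in the abstract, the bidirectional best-hit criterion and consensus clusters, the following Python sketch may help. It is not code from GET_HOMOLOGUES: the function names, the input layout (a dictionary of pairwise similarity scores), and the toy gene identifiers are all assumptions made for this example, and the consensus step is assumed here to be a plain intersection of the cluster sets produced by different methods.

```python
from typing import Dict, FrozenSet, Set, Tuple

# Hypothetical input layout: {(query_gene, subject_gene): score}.
# Real scores would come from a sequence similarity search; these are made up.
Hits = Dict[Tuple[str, str], float]


def best_hits(hits: Hits) -> Dict[str, str]:
    """Map each query gene to its highest-scoring subject gene."""
    best: Dict[str, Tuple[str, float]] = {}
    for (query, subject), score in hits.items():
        # Strict comparison: on a tie, the first hit encountered is kept.
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def bidirectional_best_hits(a_vs_b: Hits, b_vs_a: Hits) -> Set[Tuple[str, str]]:
    """Return pairs (a, b) where a's best hit is b and b's best hit is a."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return {(a, b) for a, b in best_ab.items() if best_ba.get(b) == a}


def consensus_clusters(*cluster_sets: Set[FrozenSet[str]]) -> Set[FrozenSet[str]]:
    """Keep only clusters reported identically by every method (assumed definition)."""
    return set.intersection(*cluster_sets)


# Toy data: genes a1..a3 from genome A, b1..b2 from genome B.
a_vs_b = {("a1", "b1"): 200.0, ("a1", "b2"): 150.0,
          ("a2", "b2"): 90.0, ("a3", "b2"): 120.0}
b_vs_a = {("b1", "a1"): 198.0, ("b2", "a3"): 118.0, ("b2", "a2"): 80.0}

pairs = bidirectional_best_hits(a_vs_b, b_vs_a)

# Clusters from two hypothetical methods; only the shared cluster survives.
method_1 = {frozenset({"a1", "b1"}), frozenset({"a3", "b2"})}
method_2 = {frozenset({"a1", "b1"}), frozenset({"a2", "a3", "b2"})}
shared = consensus_clusters(method_1, method_2)
```

In this toy case `a2` is excluded because its best hit `b2` prefers `a3`, which shows why the reciprocal requirement is stricter than one-way best hits. Requiring agreement between methods tightens the result further, at the cost of discarding clusters that any single method draws differently.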
