Journal: BMC Bioinformatics

GOMCL: a toolkit to cluster, evaluate, and extract non-redundant associations of Gene Ontology-based functions



Abstract

Functional enrichment of genes and pathways based on Gene Ontology (GO) has been widely used to describe the results of various -omics analyses. GO terms that are statistically overrepresented within a large gene set are typically used to describe the main functional attributes of that set. However, these lists of overrepresented GO terms are often too large and contain redundant, overlapping terms, which hinders informative functional interpretation. We developed GOMCL to reduce redundancy and summarize lists of GO terms effectively and informatively. This lightweight Python toolkit efficiently identifies clusters within a list of GO terms using the Markov Clustering (MCL) algorithm, based on the overlap of gene members between GO terms. GOMCL facilitates biological interpretation of a large number of GO terms by condensing them into GO clusters representing non-overlapping functional themes. It can visualize GO clusters as a heatmap, as networks based on either the overlap of members or the hierarchy among GO terms, and as tables with depth and cluster information for each GO term. Each GO cluster generated by GOMCL can be evaluated and further divided into non-overlapping sub-clusters using the GOMCL-sub module. The outputs from both GOMCL and GOMCL-sub can be imported into Cytoscape for additional visualization effects. GOMCL is a convenient toolkit to cluster, evaluate, and extract non-redundant associations of Gene Ontology-based functions. It helps researchers reduce the time spent on manual curation of large lists of GO terms, minimize biases introduced by redundant GO terms in data interpretation, and batch-process multiple GO enrichment datasets. A user guide, a test dataset, and the source code of GOMCL are available at https://github.com/Guannan-Wang/GOMCL and www.lsugenomics.org.
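To make the clustering idea in the abstract concrete, the following is a minimal sketch of Markov Clustering applied to GO terms linked by shared gene members. It is not GOMCL's own code: the function names, the overlap coefficient used as the similarity measure, the 0.5 cutoff, the inflation value of 2.0, and the placeholder term and gene labels are all assumptions chosen for illustration. MCL itself alternates expansion (squaring the column-stochastic matrix) with inflation (raising entries to a power and renormalizing each column) until the matrix stops changing.

```python
from itertools import combinations


def overlap_coefficient(a, b):
    """Shared genes divided by the size of the smaller gene set (assumed measure)."""
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


def build_similarity(go_genes, cutoff=0.5):
    """Build a symmetric similarity matrix; pairs below the cutoff get no edge."""
    terms = sorted(go_genes)
    n = len(terms)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        matrix[i][i] = 1.0  # self-loops, as is customary for MCL
    for i, j in combinations(range(n), 2):
        score = overlap_coefficient(go_genes[terms[i]], go_genes[terms[j]])
        if score >= cutoff:
            matrix[i][j] = matrix[j][i] = score
    return terms, matrix


def normalize_columns(matrix):
    """Scale every column so that it sums to 1 (column-stochastic matrix)."""
    n = len(matrix)
    for j in range(n):
        total = sum(matrix[i][j] for i in range(n))
        if total > 0:
            for i in range(n):
                matrix[i][j] /= total
    return matrix


def mcl(matrix, inflation=2.0, max_iterations=50, tol=1e-9):
    """Alternate expansion and inflation until the matrix stops changing."""
    n = len(matrix)
    current = normalize_columns([row[:] for row in matrix])
    for _ in range(max_iterations):
        # Expansion: square the matrix (flow along two-step random walks).
        expanded = [
            [sum(current[i][k] * current[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)
        ]
        # Inflation: element-wise power, then renormalize each column.
        inflated = normalize_columns([[v ** inflation for v in row] for row in expanded])
        delta = max(abs(inflated[i][j] - current[i][j]) for i in range(n) for j in range(n))
        current = inflated
        if delta < tol:
            break
    return current


def extract_clusters(terms, matrix, eps=1e-6):
    """Each row with non-zero entries is an attractor; its non-zero columns form a cluster."""
    clusters, seen = [], set()
    for row in matrix:
        members = frozenset(terms[j] for j, value in enumerate(row) if value > eps)
        if members and members not in seen:
            seen.add(members)
            clusters.append(sorted(members))
    return clusters


# Hypothetical enrichment result: placeholder GO term labels mapped to placeholder gene IDs.
EXAMPLE = {
    "GO:A": {"g1", "g2", "g3", "g4"},
    "GO:B": {"g2", "g3", "g4", "g5"},
    "GO:C": {"g1", "g2", "g3"},
    "GO:D": {"g10", "g11", "g12"},
    "GO:E": {"g10", "g11", "g13"},
}

if __name__ == "__main__":
    terms, similarity = build_similarity(EXAMPLE)
    for cluster in extract_clusters(terms, mcl(similarity)):
        print(cluster)
```

In this toy input the first three terms share most of their genes and the last two share genes only with each other, so the terms fall into two groups with no edges between them. A higher inflation value generally yields more, smaller clusters; the pure-Python matrix product is only practical for short term lists, and a numerical library would be the usual choice for real enrichment results.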
