Journal: Interdisciplinary Sciences: Computational Life Sciences

GAD: A Python Script for Dividing Genome Annotation Files into Feature-Based Files


Abstract

The manipulation and analysis of genomic data stored in publicly accessible repositories have become a daily task in genomics and bioinformatics laboratories. Driven by rapid advances in genome sequencing and the emergence of many sequencing projects, bioinformaticians have pushed for programs and pipelines that analyze such large datasets automatically, in particular gene annotation pipelines. Simple, easy-to-use programs for handling annotation files are important, particularly for non-developers, because they speed up genomic data analysis. One of the first tasks when working with genomic annotation files is extracting the different features. To this end, we have developed GAD () in Python as a fast, easy, and controlled script for handling annotation files such as GFF3 and GTF. GAD is a cross-platform tool with a graphical interface that extracts genome features such as intergenic regions and upstream and downstream genes. In addition, GAD finds all ambiguous sequence ontology names and either extracts them or treats them as genes or transcripts. Results are written in a variety of file formats supported by other bioinformatics programs, such as BED, GTF, GFF3, and FASTA. GAD can handle large genomes and any number of files with minimal user effort. Our script could therefore be integrated into various pipelines in genomics laboratories to accelerate data analysis.
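To make one of the tasks named in the abstract concrete, the following is a minimal Python sketch of extracting intergenic regions from GFF3 records and reporting them as BED intervals. It is not GAD's implementation: the function names `read_genes` and `intergenic_bed` and the sample records in `EXAMPLE` are hypothetical and chosen only for illustration. The sketch assumes that "intergenic" means the gaps between merged `gene` features on the same sequence, and it ignores strand and the regions before the first and after the last gene.

```python
from collections import defaultdict


def read_genes(gff3_text):
    """Collect (start, end) of every 'gene' feature, grouped by sequence ID.

    Coordinates are kept as in GFF3: 1-based, inclusive.
    """
    genes = defaultdict(list)
    for line in gff3_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines, directives, and comments
        fields = line.split("\t")
        if len(fields) != 9:
            continue  # not a well-formed nine-column feature line
        seqid, _source, ftype, start, end = fields[:5]
        if ftype == "gene":
            genes[seqid].append((int(start), int(end)))
    return genes


def intergenic_bed(genes):
    """Return BED rows (0-based, half-open) for gaps between gene spans."""
    rows = []
    for seqid in sorted(genes):
        spans = sorted(genes[seqid])
        prev_end = spans[0][1]
        for start, end in spans[1:]:
            if start > prev_end + 1:
                # The gap covers prev_end+1 .. start-1 in GFF3 coordinates,
                # which is [prev_end, start-1) in BED coordinates.
                rows.append((seqid, prev_end, start - 1))
            # Overlapping or nested genes extend the current merged span.
            prev_end = max(prev_end, end)
    return rows


# Hypothetical sample records, for illustration only.
EXAMPLE = "\n".join([
    "##gff-version 3",
    "chr1\tdemo\tgene\t100\t200\t.\t+\t.\tID=g1",
    "chr1\tdemo\tmRNA\t100\t200\t.\t+\t.\tID=t1;Parent=g1",
    "chr1\tdemo\tgene\t150\t250\t.\t-\t.\tID=g2",
    "chr1\tdemo\tgene\t300\t400\t.\t+\t.\tID=g3",
])

if __name__ == "__main__":
    for seqid, start, end in intergenic_bed(read_genes(EXAMPLE)):
        print(f"{seqid}\t{start}\t{end}")
```

In this sample the first two genes overlap and are merged into one span, so the only intergenic interval reported lies between position 250 and the gene that starts at 300. A real tool would also need to handle GTF input, strand-aware upstream and downstream windows, and sequence boundaries, none of which this sketch attempts.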