Journal: Bioinformatics

An analysis of gene-finding programs for Neurospora crassa.


Abstract

MOTIVATION: Computational gene identification plays an important role in genome projects. The approaches used in gene identification programs are often tuned to one particular organism, and accuracy for one organism or class of organism does not necessarily translate to accurate predictions for other organisms. In this paper we evaluate five computer programs on their ability to locate coding regions and to predict gene structure in Neurospora crassa. One of these programs (FFG) was designed specifically for gene-finding in N.crassa, but the model parameters have not yet been fully 'tuned', and the program should thus be viewed as an initial prototype. The other four programs were neither designed nor tuned for N.crassa.

RESULTS: We describe the data sets on which the experiments were performed, the approaches employed by the five algorithms (GenScan, HMMGene, GeneMark, Pombe and FFG), the methodology of our evaluation, and the results of the experiments. Our results show that, while none of the programs consistently performs well, overall the GenScan program has the best performance on sensitivity and Missing Exons (ME), while the HMMGene and FFG programs have good performance in locating the exons roughly. Additional work motivated by this study includes the creation of a tool for the automated evaluation of gene-finding programs, the collection of larger and more reliable data sets for N.crassa, parameterization of the model used in FFG to produce a more accurate gene-finding program for this species, and a more in-depth evaluation of the reasons that existing programs generally fail for N.crassa.

AVAILABILITY: Data sets, the FFG program source code, and links to the other programs analyzed are available at http://jerry.cs.uga.edu/~wang/genefind.html.

CONTACT: eileen
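The abstract reports results in terms of sensitivity and Missing Exons (ME) but does not define them. As a rough illustration only, the sketch below computes exon-level measures under the definitions commonly used in gene-finder benchmarks: an exon counts as correct only if both boundaries match exactly, and as missing if no predicted exon overlaps it at all. That the paper uses exactly these definitions is an assumption, and the function name `exon_level_metrics` and the coordinate convention (inclusive start and end) are hypothetical choices made for this example.

```python
from typing import Dict, List, Tuple

Exon = Tuple[int, int]  # (start, end), inclusive coordinates (assumed convention)


def overlaps(a: Exon, b: Exon) -> bool:
    """Return True if the two intervals share at least one position."""
    return a[0] <= b[1] and b[0] <= a[1]


def exon_level_metrics(actual: List[Exon], predicted: List[Exon]) -> Dict[str, float]:
    """Exon-level accuracy measures (assumed, commonly used definitions).

    sensitivity   = exactly predicted exons / actual exons
    specificity   = exactly predicted exons / predicted exons
    missing_exons = actual exons with no overlapping prediction / actual exons
    wrong_exons   = predicted exons with no overlapping actual exon / predicted exons
    """
    actual_set, predicted_set = set(actual), set(predicted)

    # An exon is "correct" only when both boundaries match exactly.
    exact = len(actual_set & predicted_set)

    # Missing: an annotated exon that no predicted exon touches.
    missing = sum(
        1 for a in actual_set if not any(overlaps(a, p) for p in predicted_set)
    )

    # Wrong: a predicted exon that touches no annotated exon.
    wrong = sum(
        1 for p in predicted_set if not any(overlaps(p, a) for a in actual_set)
    )

    n_actual, n_predicted = len(actual_set), len(predicted_set)
    return {
        "sensitivity": exact / n_actual if n_actual else 0.0,
        "specificity": exact / n_predicted if n_predicted else 0.0,
        "missing_exons": missing / n_actual if n_actual else 0.0,
        "wrong_exons": wrong / n_predicted if n_predicted else 0.0,
    }


if __name__ == "__main__":
    # Toy coordinates invented for illustration; not data from the paper.
    annotated = [(100, 200), (300, 400), (500, 600)]
    called = [(100, 200), (310, 400), (700, 800)]
    print(exon_level_metrics(annotated, called))
```

In the toy example, the second predicted exon overlaps an annotated exon but has a wrong start boundary, so it is neither correct nor missing. A program that places many predictions like this can locate exons roughly while still scoring low on exact exon-level sensitivity.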
