Journal: Frontiers in Bioengineering and Biotechnology

Correcting Inconsistencies and Errors in Bacterial Genome Metadata Using an Automated Curation Tool in Excel (AutoCurE)

Abstract

Whole-genome data are invaluable for large-scale comparative genomic studies. Current sequencing technologies have made it feasible to sequence entire bacterial genomes quickly and with relative ease, at a substantially reduced cost per nucleotide and hence per genome. More than 3,000 bacterial genomes have been sequenced and are available at finished status. Publicly available genomes can be readily downloaded; however, it is challenging to verify the specific supporting data contained within a download and to identify errors and inconsistencies that may be present within the organizational data content and metadata. AutoCurE, an automated tool for bacterial genome database curation in Excel, was developed to facilitate local curation of the supporting data that accompany genomes downloaded from the National Center for Biotechnology Information. AutoCurE curates local genomic databases by comparing the downloaded supporting data with the genome reports and flagging inconsistencies or errors; the comparison verifies genome names, RefSeq accession numbers, the presence of archaea, BioProject/UIDs, and sequence file descriptions. Flags are generated for nine metadata fields when the downloaded genomes and the genome reports are inconsistent or when data are evidently erroneous or missing. AutoCurE is an easy-to-use tool for curating local databases of large-scale genome data prior to downstream analyses.
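
The comparison described in the abstract can be illustrated with a minimal Python sketch. This is not AutoCurE itself, which is an Excel tool whose implementation is not shown here; it only mirrors the general idea of checking downloaded records against a genome report and raising flags. The `Record` type, its field names, the flag wording, and the `flag_inconsistencies` function are all assumptions made for illustration, and the sketch covers only a few of the nine metadata fields the abstract mentions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """Hypothetical metadata record; field names are illustrative only."""
    accession: str            # RefSeq accession number
    name: str                 # genome (organism) name
    uid: str                  # BioProject/UID
    kingdom: str = "Bacteria"


def flag_inconsistencies(downloaded, report):
    """Compare downloaded records with the genome report.

    Returns {accession: [flag, ...]} containing only the records
    that raised at least one flag.
    """
    # Index the genome report by accession for direct lookup.
    report_by_acc = {r.accession: r for r in report}
    flags = {}
    for rec in downloaded:
        problems = []
        ref = report_by_acc.get(rec.accession)
        if ref is None:
            problems.append("accession not in genome report")
        else:
            # Names are compared case-insensitively, ignoring outer spaces.
            if rec.name.strip().lower() != ref.name.strip().lower():
                problems.append("genome name mismatch")
            if rec.uid != ref.uid:
                problems.append("BioProject/UID mismatch")
            if ref.kingdom.lower() == "archaea":
                problems.append("archaeal genome present")
        if problems:
            flags[rec.accession] = problems
    return flags


if __name__ == "__main__":
    # Made-up placeholder records, not real genome metadata.
    downloaded = [
        Record("ACC_0001", "Example bacterium A", "1001"),
        Record("ACC_0002", "Example bacterium B", "1002"),
    ]
    report = [
        Record("ACC_0001", "Example bacterium A", "1001"),
        Record("ACC_0002", "Example bacterium B", "9999"),
    ]
    for accession, problems in flag_inconsistencies(downloaded, report).items():
        print(accession, "; ".join(problems))
```

Keying the lookup on the accession number is a design choice of this sketch: it lets a record that is absent from the genome report be flagged separately from one that is present but disagrees on name or UID.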