
When Whole-Genome Alignments Just Won't Work: kSNP v2 Software for Alignment-Free SNP Discovery and Phylogenetics of Hundreds of Microbial Genomes


Abstract

Effective use of rapid and inexpensive whole-genome sequencing for microbes requires fast, memory-efficient bioinformatics tools for sequence comparison. The kSNP v2 software finds single nucleotide polymorphisms (SNPs) in whole-genome data. kSNP v2 has numerous improvements over kSNP v1, including SNP gene annotation; better scaling for draft genomes available as assembled contigs or raw, unassembled reads; a tool to identify the optimal value of k; distribution of packages of executables for Linux and Mac OS X for ease of installation and use; and a detailed User Guide. SNP discovery is based on k-mer analysis and requires neither multiple sequence alignment nor the selection of a single reference genome. Most target sets with hundreds of genomes complete in minutes to hours. SNP phylogenies are built by maximum likelihood, parsimony, and distance, based on all SNPs, only core SNPs, or SNPs present in some intermediate user-specified fraction of targets. The resulting SNP-based trees are consistent with known taxonomy. kSNP v2 can handle many gigabases of sequence in a single run, and if one or more annotated genomes are included in the target set, SNPs are annotated with protein-coding and other information (UTRs, etc.) from GenBank file(s). We demonstrate the application of kSNP v2 to sets of viral and bacterial genomes, and discuss in detail the analysis of a set of 68 finished E. coli and Shigella genomes, and of the same set with the addition of 47 assemblies and four "raw read" genomes of O104:H4 strains from the recent European E. coli outbreak that resulted in both bloody diarrhea and hemolytic uremic syndrome (HUS) and caused at least 50 deaths.
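
The alignment-free idea in the abstract can be illustrated with a short sketch. It is not kSNP's actual implementation. It assumes a candidate SNP locus is an odd-length k-mer whose flanking bases match exactly across genomes while the central base differs. The function name `find_kmer_snps`, its parameters, and the toy sequences are hypothetical, and reverse-complement strands are ignored for brevity. The `min_fraction` argument mimics the choice between core SNPs (1.0) and SNPs present in a user-specified fraction of genomes.

```python
from collections import defaultdict


def find_kmer_snps(genomes, k=5, min_fraction=1.0):
    """Return {(left_flank, right_flank): {genome_name: central_base}}.

    genomes: dict mapping genome name -> DNA string (hypothetical input format).
    k: odd k-mer length; the central base is the candidate SNP position.
    min_fraction: fraction of genomes that must contain the locus
                  (1.0 keeps only "core" loci).
    """
    if k % 2 == 0:
        raise ValueError("k must be odd so that a k-mer has a central base")
    half = k // 2

    # locus (pair of flanks) -> genome -> set of central bases seen there
    loci = defaultdict(lambda: defaultdict(set))
    for name, seq in genomes.items():
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if set(kmer) - set("ACGT"):
                continue  # skip k-mers containing ambiguous bases
            flanks = (kmer[:half], kmer[half + 1:])
            loci[flanks][name].add(kmer[half])

    snps = {}
    for flanks, per_genome in loci.items():
        # Discard loci where one genome shows conflicting central bases.
        if any(len(bases) > 1 for bases in per_genome.values()):
            continue
        alleles = {name: next(iter(bases)) for name, bases in per_genome.items()}
        present_often_enough = len(alleles) >= min_fraction * len(genomes)
        varies = len(set(alleles.values())) > 1
        if varies and present_often_enough:
            snps[flanks] = alleles
    return snps


# Toy data: g3 differs from g1 and g2 at a single position.
genomes = {
    "g1": "ACGTACGGTTCA",
    "g2": "ACGTACGGTTCA",
    "g3": "ACGTATGGTTCA",
}
for (left, right), alleles in sorted(find_kmer_snps(genomes, k=5).items()):
    print(left, right, alleles)
```

On the toy data the sketch reports one locus, with flanks TA and GG, where g1 and g2 carry C and g3 carries T. No alignment and no reference genome are needed because loci are keyed by their flanking sequence alone.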

Bibliographic record

  • Journal: other
  • Authors: Shea N. Gardner; Barry G. Hall
  • Volume (issue): 8(12)
  • Pages: e81760
  • Total pages: 12
  • Original format: PDF
  • Date indexed: 2022-08-21 11:20:15
