Journal: Nature Communications

Sequence element enrichment analysis to determine the genetic basis of bacterial phenotypes

Abstract

Bacterial genomes vary extensively in terms of both gene content and gene sequence. This plasticity hampers the use of traditional SNP-based methods for identifying all genetic associations with phenotypic variation. Here we introduce a computationally scalable and widely applicable statistical method (SEER) for the identification of sequence elements that are significantly enriched in a phenotype of interest. SEER is applicable to tens of thousands of genomes by counting variable-length k-mers using a distributed string-mining algorithm. Robust options are provided for association analysis that also correct for the clonal population structure of bacteria. Using large collections of genomes of the major human pathogens Streptococcus pneumoniae and Streptococcus pyogenes, SEER identifies relevant previously characterized resistance determinants for several antibiotics and discovers potential novel factors related to the invasiveness of S. pyogenes. We thus demonstrate that our method can answer important biologically and medically relevant questions.
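
To make the idea of k-mer enrichment concrete, the following is a minimal illustrative sketch and not SEER itself. It assumes fixed-length k-mers, presence/absence counting held in memory, and a one-sided Fisher's exact test. The abstract describes variable-length k-mers, a distributed string-mining algorithm, and a correction for clonal population structure, none of which are reproduced here. All function names, sequences, and phenotype labels are hypothetical.

```python
from math import comb

def kmers(seq, k):
    """Return the set of distinct length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def fisher_enrichment_p(a, b, c, d):
    """One-sided Fisher's exact p-value for enrichment.

    a: cases with the k-mer,    b: cases without it,
    c: controls with the k-mer, d: controls without it.
    Returns the hypergeometric probability of observing at least
    `a` carriers among the cases.
    """
    n_cases, n_with = a + b, a + c
    n = a + b + c + d
    upper = min(n_cases, n_with)
    tail = sum(comb(n_with, x) * comb(n - n_with, n_cases - x)
               for x in range(a, upper + 1))
    return tail / comb(n, n_cases)

def scan(genomes, phenotypes, k):
    """Test every observed k-mer; return (p_value, kmer) pairs, smallest p first.

    No population-structure correction is applied (an assumption of this
    sketch), so clonal samples would inflate the apparent significance.
    """
    present = {name: kmers(seq, k) for name, seq in genomes.items()}
    n_cases = sum(1 for name in genomes if phenotypes[name])
    n_controls = len(genomes) - n_cases
    results = []
    for km in set().union(*present.values()):
        a = sum(1 for name in genomes if phenotypes[name] and km in present[name])
        c = sum(1 for name in genomes if not phenotypes[name] and km in present[name])
        results.append((fisher_enrichment_p(a, n_cases - a, c, n_controls - c), km))
    return sorted(results)

# Toy data (made up): two "resistant" genomes share an inserted element.
toy_genomes = {
    "case1": "TTTTGATTACA",
    "case2": "CCCCGATTACA",
    "ctrl1": "TTTTCCCC",
    "ctrl2": "CCCCTTTT",
}
toy_phenotypes = {"case1": True, "case2": True, "ctrl1": False, "ctrl2": False}
toy_results = scan(toy_genomes, toy_phenotypes, k=7)
```

With real data, the thousands of tests performed would also need a multiple-testing correction, and relatedness between isolates would need to be modelled, which is the role of the population-structure correction mentioned in the abstract.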