Frontiers in Microbiology

The R package otu2ot for implementing the entropy decomposition of nucleotide variation in sequence data

Abstract

Oligotyping is a novel, supervised computational method that classifies closely related sequences into “oligotypes” (OTs) based on subtle nucleotide variation (Eren et al.). Its application to microbial datasets has helped reveal ecological patterns that are often hidden by the way sequence data are currently clustered to define operational taxonomic units (OTUs). Here, we implemented the OT entropy decomposition procedure and its unsupervised version, Minimal Entropy Decomposition (MED; Eren et al.), in the statistical programming language and environment R. The aim of this implementation is to facilitate the integration of computational routines, interactive statistical analyses, and visualization into a single framework. In addition, two complementary approaches are implemented: (1) an analytical method (the broken stick model) to help identify low-abundance OTs that could be generated by chance alone, and (2) a one-pass profiling (OP) method to efficiently identify the OTUs that are the most promising candidates for subsequent oligotyping. These enhancements are especially useful for large datasets, where manual screening of entropy analysis results and the creation of a full set of OTs may not be feasible. The package and procedures are illustrated by several tutorials and examples.
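
The two ideas named in the abstract, entropy decomposition and the broken stick model, can be illustrated with a minimal sketch. It is written in Python for brevity and is not the otu2ot API: the function names (`position_entropy`, `split_on_max_entropy`, `broken_stick`) and the toy reads are hypothetical, the reads are assumed to be aligned and of equal length, and the broken stick function only computes the textbook expected piece sizes. How the package actually compares those expectations with observed OT abundances is not stated in the abstract.

```python
import math
from collections import Counter


def position_entropy(seqs):
    """Shannon entropy (in bits) of the characters in each alignment column."""
    n = len(seqs)
    entropies = []
    for column in zip(*seqs):  # assumes aligned, equal-length sequences
        counts = Counter(column)
        entropies.append(-sum((c / n) * math.log2(c / n) for c in counts.values()))
    return entropies


def split_on_max_entropy(seqs):
    """Group sequences by the character at the highest-entropy column."""
    entropies = position_entropy(seqs)
    pos = max(range(len(entropies)), key=entropies.__getitem__)
    groups = {}
    for s in seqs:
        groups.setdefault(s[pos], []).append(s)
    return pos, groups


def broken_stick(n):
    """Expected relative size of the k-th largest of n random pieces, k = 1..n."""
    return [sum(1.0 / i for i in range(k, n + 1)) / n for k in range(1, n + 1)]


# Hypothetical toy data: five aligned reads that vary at two positions.
reads = ["ACGTA", "ACGTA", "ACCTA", "ACCTA", "ACGTT"]
entropy = position_entropy(reads)
pos, groups = split_on_max_entropy(reads)
expected = broken_stick(3)
```

In this toy input, the third column (index 2) carries the most entropy, so the reads are split there into a group of three and a group of two. Repeating the split within each group until no column exceeds an entropy threshold would correspond to the unsupervised decomposition idea, and groups whose relative abundance falls below the broken stick expectation could be flagged as possibly arising by chance.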