
Restoring the Duality between Principal Components of a Distance Matrix and Linear Combinations of Predictors with Application to Studies of the Microbiome

Abstract

Appreciation of the importance of the microbiome is increasing, as sequencing technology has made it possible to ascertain the microbial content of a variety of samples. Studies that sequence the 16S rRNA gene, ubiquitous in and nearly exclusive to bacteria, have proliferated in the medical literature. After sequences are binned into operational taxonomic units (OTUs) or species, data from these studies are summarized in a data matrix with the observed counts from each OTU for each sample. Analysis often reduces these data further to a matrix of pairwise distances or dissimilarities; plotting the first two or three principal components (PCs) of this distance matrix often reveals meaningful groupings in the data. However, once the distance matrix is calculated, it is no longer clear which OTUs or species are important to the observed clustering; further, the PCs are hard to interpret and cannot be calculated for subsequent observations. We show how to construct approximate decompositions of the data matrix that pair PCs with linear combinations of OTU or species frequencies, and show how these decompositions can be used to construct biplots, select important OTUs and partition the variability in the data matrix into contributions corresponding to PCs of an arbitrary distance or dissimilarity matrix. To illustrate our approach, we conduct an analysis of the bacteria found in 45 smokeless tobacco samples.
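
As a point of orientation, the standard step the abstract refers to (reducing an OTU count table to a distance matrix and extracting its leading principal components) can be sketched as below. This is a minimal illustration of classical principal coordinates analysis, not the approximate decomposition the authors propose. The Bray-Curtis dissimilarity, the simulated count table, and the function names `bray_curtis` and `pcoa` are all assumptions chosen for the example.

```python
import numpy as np


def bray_curtis(counts):
    """Pairwise Bray-Curtis dissimilarity between samples (rows).

    Counts are first converted to within-sample frequencies, so each row
    sums to 1 before dissimilarities are computed.
    """
    freqs = counts / counts.sum(axis=1, keepdims=True)
    n = freqs.shape[0]
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            num = np.abs(freqs[i] - freqs[j]).sum()
            den = (freqs[i] + freqs[j]).sum()
            dist[i, j] = num / den
    return dist


def pcoa(dist, k=2):
    """Classical principal coordinates analysis of a distance matrix.

    Returns the first k sample coordinates (PCs) and their eigenvalues.
    """
    n = dist.shape[0]
    # Gower centering: turn squared distances into an inner-product matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    # eigh returns ascending eigenvalues; keep the k largest.
    order = np.argsort(eigvals)[::-1][:k]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Negative eigenvalues (non-Euclidean dissimilarities) are clipped to 0.
    coords = eigvecs * np.sqrt(np.clip(eigvals, 0, None))
    return coords, eigvals


# Hypothetical data: 6 samples by 10 OTUs, simulated purely for illustration.
rng = np.random.default_rng(0)
counts = rng.poisson(lam=20, size=(6, 10)) + 1  # +1 avoids all-zero rows
dist = bray_curtis(counts)
coords, eigvals = pcoa(dist, k=2)
```

Note that `coords` depends on the samples only through `dist`: nothing in the output indicates which OTUs drive each axis, and a new sample cannot be projected without recomputing the decomposition. That is the gap the abstract describes.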
