BMC Bioinformatics

Principal component analysis for designed experiments



Abstract

Background: Principal component analysis (PCA) summarizes matrix data, such as those from transcriptome, proteome, or metabolome studies and medical examinations, into fewer dimensions by fitting the matrix to orthogonal axes. Although this methodology is frequently used in multivariate analyses, it has disadvantages when applied to experimental data. First, the identified principal components have poor generality: because the size and directions of the components depend on the particular data set, the components are valid only within that data set. Second, the method is sensitive to experimental noise and to bias between sample groups. It cannot reflect the experimental design that was planned to manage noise and bias; rather, it assigns the same weight and independence to all samples in the matrix. Third, the resulting components are often difficult to interpret. To address these issues, several options were introduced to the methodology. First, the principal axes were identified using training data sets and shared across experiments. These training data reflect the design of the experiments, and their preparation allows noise to be reduced and group bias to be removed. Second, the center of the rotation was determined in accordance with the experimental design. Third, the resulting components were scaled to unify their size unit.

Results: The effects of these options were observed in microarray experiments and showed improved separation of groups and robustness to noise. The range of the scaled scores was unaffected by the number of items. Additionally, unknown samples were appropriately classified using pre-arranged axes. Furthermore, these axes reflected the characteristics of the groups in the experiments well. As observed, scaling the components and sharing the axes enabled comparisons of the components across experiments. The use of training data reduced the effects of noise and bias in the data, facilitating the physical interpretation of the principal axes.

Conclusions: Together, these options improve the generality and objectivity of the analytical results. The methodology has thus become more like a set of multiple regression analyses that find independent models, each of which specifies one of the axes.
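The three options described in the abstract (axes learned from training data, a design-based center, and scaled scores) can be illustrated with a minimal NumPy sketch. The abstract gives no formulas, so everything specific here is an assumption made for illustration: the function names `fit_axes` and `scaled_scores`, the use of group means as the training data, the control-group mean as the center of rotation, the division by the square root of the number of items as the scaling, and the simulated data. It is not the authors' implementation.

```python
import numpy as np


def fit_axes(training, center):
    """Find principal axes from training data (samples x items).

    The data are centered at `center` (chosen from the experimental
    design) instead of at the overall mean of the matrix.
    """
    centered = training - center
    # Rows of vt are unit-length, mutually orthogonal principal axes.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    return vt, singular_values


def scaled_scores(data, center, axes, n_components=1):
    """Project samples onto shared axes and scale the scores.

    Dividing by sqrt(number of items) is an ASSUMED scaling, chosen so
    that score size does not grow with the number of items.
    """
    n_items = data.shape[1]
    return (data - center) @ axes[:n_components].T / np.sqrt(n_items)


# Simulated data (hypothetical): 6 control and 6 treated samples, 500 items.
rng = np.random.default_rng(0)
n_items = 500
effect = np.zeros(n_items)
effect[:50] = 3.0  # treatment shifts the first 50 items
control = rng.normal(0.0, 1.0, size=(6, n_items))
treated = rng.normal(0.0, 1.0, size=(6, n_items)) + effect

# Training data: one row per group mean, which averages out replicate noise.
training = np.vstack([control.mean(axis=0), treated.mean(axis=0)])
# Center of rotation: the control group, as the design's reference point.
center = control.mean(axis=0)

axes, _ = fit_axes(training, center)

# An unknown sample is scored on the pre-arranged axes without refitting.
unknown = rng.normal(0.0, 1.0, size=(1, n_items)) + effect
scores_control = scaled_scores(control, center, axes)
scores_unknown = scaled_scores(unknown, center, axes)

print("control scores:", np.round(scores_control.ravel(), 3))
print("unknown score: ", np.round(scores_unknown.ravel(), 3))
```

Because the sign of a principal axis is arbitrary, compare magnitudes rather than signs: in this simulation the unknown treated sample lies far from zero on the first axis, while the control samples cluster around it.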
