Source: Briefings in Bioinformatics

Principal component analysis based methods in bioinformatics studies


Abstract

In the analysis of bioinformatics data, a unique challenge arises from the high dimensionality of the measurements. Without loss of generality, we use genomic studies with gene expression measurements as a representative example, but note that the analysis techniques discussed in this article are also applicable to other types of bioinformatics studies. Principal component analysis (PCA) is a classic dimension reduction approach. It constructs linear combinations of gene expressions, called principal components (PCs). The PCs are orthogonal to each other, can effectively explain the variation of gene expressions, and may have a much lower dimensionality. PCA is computationally simple and can be realized using many existing software packages. This article consists of the following parts. First, we review the standard PCA technique and its applications in bioinformatics data analysis. Second, we describe recent ‘non-standard’ applications of PCA, including accommodating interactions among genes, pathways and network modules, and conducting PCA with estimating equations as opposed to gene expressions. Third, we introduce several recently proposed PCA-based techniques, including supervised PCA, sparse PCA and functional PCA. Supervised PCA and sparse PCA have been shown to have better empirical performance than standard PCA. Functional PCA can analyze time-course gene expression data. Last, we raise awareness of several critical but unsolved problems related to PCA. The goal of this article is to make bioinformatics researchers aware of the PCA technique and, more importantly, its most recent developments, so that this simple yet effective dimension reduction technique can be better employed in bioinformatics data analysis.
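The standard PCA described in the abstract can be illustrated with a minimal sketch. The code below is not taken from the article: it assumes NumPy, uses a simulated expression matrix in place of real data, and the function name `pca` and all sizes are hypothetical choices made for illustration. It computes the PCs through a singular value decomposition of the centered data, which is one common way to obtain them.

```python
import numpy as np


def pca(X, n_components):
    """Standard PCA on a samples-by-genes matrix X (illustrative helper).

    Returns the PC scores, the gene loadings, and the fraction of total
    variance explained by each retained PC.
    """
    # Center each gene (column) so the PCs describe variation around the mean.
    Xc = X - X.mean(axis=0)
    # SVD of the centered data: rows of Vt are the orthonormal PC directions.
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    loadings = Vt[:n_components].T        # genes x k
    scores = Xc @ loadings                # samples x k, the PCs themselves
    variances = s ** 2 / (X.shape[0] - 1)
    explained_ratio = variances[:n_components] / variances.sum()
    return scores, loadings, explained_ratio


# Simulated data (an assumption, not real expression measurements):
# 50 samples, 1000 genes, driven by 2 latent factors plus small noise.
rng = np.random.default_rng(0)
n_samples, n_genes = 50, 1000
latent = rng.normal(size=(n_samples, 2))
weights = rng.normal(size=(2, n_genes))
X = latent @ weights + 0.1 * rng.normal(size=(n_samples, n_genes))

scores, loadings, explained_ratio = pca(X, n_components=3)
print("scores shape:", scores.shape)
print("explained variance ratio:", np.round(explained_ratio, 4))
```

In this simulated setting the number of genes far exceeds the number of samples, yet a handful of PCs summarizes most of the variation, which is the dimension reduction the abstract refers to. The loadings are orthonormal and the resulting PC scores are mutually uncorrelated.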
