Low-Rank Projection Least Squares Regression Subspace Segmentation for Gene Expression Data

Journal: 《模式识别与人工智能》 (Pattern Recognition and Artificial Intelligence)

     

Abstract

Gene expression data are high-dimensional, highly redundant, and noisy, and they come with small sample sizes, which makes traditional clustering methods inefficient. Subspace segmentation is an effective approach to clustering high-dimensional data, but applying it directly to gene expression data degrades clustering performance. To cluster gene expression data more effectively, a low-rank projection least squares regression subspace segmentation method (LPLSR) is proposed. First, an improved low-rank method projects the gene expression data into a latent subspace, removing possible corruptions and yielding a relatively clean data dictionary. Then, least squares regression is used to obtain a low-dimensional representation of the data vectors, and an affinity matrix is constructed from that representation to cluster the gene data. Experimental results on six public gene expression datasets show the validity of the proposed method.
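The two stages described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The abstract does not specify the "improved low-rank method", so a plain truncated SVD stands in for it here. The least squares regression step is assumed to take the common ridge form, min ‖D − DZ‖² + λ‖Z‖² (Frobenius norms), whose closed-form solution is Z = (DᵀD + λI)⁻¹DᵀD. The function names, the parameters `rank` and `lam`, and the synthetic two-subspace data are all hypothetical and chosen only for illustration.

```python
import numpy as np

def low_rank_project(X, rank):
    """Stand-in for the low-rank cleaning step (assumption: truncated SVD).

    X has shape (features, samples); the result keeps only the top `rank`
    singular directions and serves as the "clean" dictionary.
    """
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank]

def lsr_affinity(D, lam=0.1):
    """Ridge self-representation followed by a symmetric affinity matrix.

    Solves Z = (D^T D + lam * I)^{-1} D^T D, then returns (|Z| + |Z^T|) / 2.
    """
    G = D.T @ D
    Z = np.linalg.solve(G + lam * np.eye(G.shape[0]), G)
    return (np.abs(Z) + np.abs(Z.T)) / 2.0

def make_two_subspaces(n_features=50, dim=2, n_per=10, noise=0.01, seed=0):
    """Synthetic data (not gene expression data): two noisy low-dim subspaces."""
    rng = np.random.default_rng(seed)
    blocks = []
    for _ in range(2):
        # Orthonormal basis of a random `dim`-dimensional subspace.
        basis = np.linalg.qr(rng.standard_normal((n_features, dim)))[0]
        blocks.append(basis @ rng.standard_normal((dim, n_per)))
    X = np.hstack(blocks)
    X = X + noise * rng.standard_normal(X.shape)
    labels = np.repeat([0, 1], n_per)
    return X, labels

X, labels = make_two_subspaces()
D = low_rank_project(X, rank=4)   # two subspaces of dimension 2 each
W = lsr_affinity(D, lam=0.1)

# Compare average affinity inside a subspace with affinity across subspaces.
same = labels[:, None] == labels[None, :]
print("within-subspace mean affinity:", W[same].mean())
print("cross-subspace mean affinity: ", W[~same].mean())
```

The final clustering step is left out of the sketch. An affinity matrix such as `W` would typically be handed to a spectral clustering routine, for example scikit-learn's `SpectralClustering` with `affinity="precomputed"`.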
