Conference: International Conference on Advances in Computing, Communications and Informatics

An improved fuzzy based approach to impute missing values in DNA microarray gene expression data with collaborative filtering

Abstract

DNA microarray experiments typically produce gene expression profiles in the form of high-dimensional matrices. These matrices often contain many missing values, caused by factors such as image disruption, hybridization error, dust, and moderate resolution. Such missing values can significantly degrade the performance of subsequent statistical and machine learning analyses. Various missing value estimation algorithms exist. In this work we propose a modification of an existing imputation approach, Collaborative Filtering Based on Rough-Set Theory (CFBRST) [10]. The proposed approach (CFBRSTFDV) combines a Fuzzy Difference Vector (FDV) with rough-set-based collaborative filtering, which analyzes historical interactions to help estimate the missing values. It is a recommendation-based system that works on the same principle by which item or product suggestions reach a user of Facebook or Twitter, or someone looking for books on Amazon. We applied the proposed algorithm to two benchmark datasets, SPELLMAN and Tumor Cell (GDS2932). The experiments show that the modified approach, CFBRSTFDV, outperforms other existing state-of-the-art methods in terms of RMSE, particularly as the number of missing values increases.
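
The evaluation protocol implied by the abstract (hide known entries, impute them, then score with RMSE) can be illustrated with a minimal sketch. The abstract does not give the details of CFBRST or CFBRSTFDV, so the imputation step below is an assumption: a generic neighbour-based, collaborative-filtering-style estimator stands in for the authors' method. The function names `rmse` and `knn_impute`, the parameter `k`, the 10% missing rate, and the random data are all hypothetical and chosen only for illustration.

```python
import numpy as np


def rmse(true_vals, imputed_vals):
    """Root mean squared error between true and imputed entries."""
    true_vals = np.asarray(true_vals, dtype=float)
    imputed_vals = np.asarray(imputed_vals, dtype=float)
    return float(np.sqrt(np.mean((true_vals - imputed_vals) ** 2)))


def knn_impute(matrix, k=3):
    """Fill NaNs row by row from the k most similar rows.

    Generic collaborative-filtering-style stand-in, NOT the authors'
    CFBRSTFDV: no rough sets and no fuzzy difference vector are used.
    """
    data = np.asarray(matrix, dtype=float)
    filled = data.copy()
    n_rows = data.shape[0]
    for i in range(n_rows):
        missing = np.isnan(data[i])
        if not missing.any():
            continue
        # Rank the other rows by RMS distance over co-observed columns.
        candidates = []
        for j in range(n_rows):
            if j == i:
                continue
            shared = ~np.isnan(data[i]) & ~np.isnan(data[j])
            if not shared.any():
                continue
            diff = data[i, shared] - data[j, shared]
            candidates.append((float(np.sqrt(np.mean(diff ** 2))), j))
        candidates.sort()
        for col in np.flatnonzero(missing):
            # Keep the k nearest rows that actually observed this column.
            usable = [(d, j) for d, j in candidates
                      if not np.isnan(data[j, col])][:k]
            if not usable:
                # Fallback (assumption): mean of the observed column values.
                filled[i, col] = np.nanmean(data[:, col])
                continue
            weights = np.array([1.0 / (d + 1e-9) for d, _ in usable])
            values = np.array([data[j, col] for _, j in usable])
            filled[i, col] = float(np.sum(weights * values) / np.sum(weights))
    return filled


# Synthetic stand-in for an expression matrix (rows: genes, columns: samples).
rng = np.random.default_rng(0)
complete = rng.normal(size=(20, 8))
mask = rng.random(complete.shape) < 0.10  # hide roughly 10% of the entries
with_missing = complete.copy()
with_missing[mask] = np.nan

imputed = knn_impute(with_missing, k=3)
print("RMSE on hidden entries:", rmse(complete[mask], imputed[mask]))
```

Repeating the last step with larger missing rates would give the kind of RMSE-versus-missing-rate comparison the abstract refers to, although the numbers from this sketch say nothing about the performance of the methods in the paper.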