International Conference on Intelligent Data Analysis

Improving k-NN for Human Cancer Classification Using the Gene Expression Profiles


Abstract

The k-Nearest Neighbor (k-NN) classifier has been applied to the identification of cancer samples from gene expression profiles, with encouraging results. However, k-NN usually relies on Euclidean distances, which often fail to reflect the sample proximities accurately. Non-Euclidean dissimilarities focus on different features of the data and should be integrated in order to reduce misclassification errors. In this paper, we learn a linear combination of dissimilarities using a regularized kernel alignment algorithm. The weights of the combination are learnt in an HRKHS (Hyper Reproducing Kernel Hilbert Space) using a Semidefinite Programming algorithm. This approach allows us to incorporate a smoothing term that penalizes the complexity of the family of distances and avoids overfitting. The experimental results suggest that the proposed method outperforms other metric learning strategies and improves on the classical k-NN algorithm based on a single dissimilarity.
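
The idea of combining several dissimilarities for k-NN can be illustrated with a small sketch. This is not the authors' method: the regularized alignment, the HRKHS and the Semidefinite Programming step are all replaced here by a simple heuristic that weights each dissimilarity by its (uncentered) kernel-target alignment. The three dissimilarities, the exponential kernel, the scaling by mean dissimilarity and every function name below are assumptions made for illustration only.

```python
import math
from collections import Counter

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def cosine_dissimilarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (norm_a * norm_b)  # assumes non-zero vectors

# Hypothetical choice of base dissimilarities; the abstract does not list them.
DISSIMILARITIES = [euclidean, manhattan, cosine_dissimilarity]

def mean_offdiag(X, dist):
    """Mean dissimilarity over all ordered pairs of distinct samples."""
    n = len(X)
    total = sum(dist(X[i], X[j]) for i in range(n) for j in range(n) if i != j)
    return total / (n * (n - 1))

def alignment(X, y, dist, scale):
    """Uncentered alignment between K = exp(-D / scale) and the ideal
    kernel T, where T[i][j] is +1 for equal labels and -1 otherwise."""
    n = len(X)
    num = 0.0
    sq = 0.0
    for i in range(n):
        for j in range(n):
            k_ij = math.exp(-dist(X[i], X[j]) / scale)
            t_ij = 1.0 if y[i] == y[j] else -1.0
            num += k_ij * t_ij
            sq += k_ij * k_ij
    return num / (math.sqrt(sq) * n)  # the Frobenius norm of T is n

def fit_weights(X, y, dists=DISSIMILARITIES):
    """Heuristic stand-in for the SDP: weights proportional to the
    alignment, clipped at zero and normalised to sum to one."""
    scales = [mean_offdiag(X, d) for d in dists]
    raw = [max(alignment(X, y, d, s), 0.0) for d, s in zip(dists, scales)]
    total = sum(raw)
    if total == 0.0:
        weights = [1.0 / len(dists)] * len(dists)  # fall back to uniform
    else:
        weights = [r / total for r in raw]
    return weights, scales

def combined(a, b, weights, scales, dists=DISSIMILARITIES):
    """Weighted linear combination of the scaled dissimilarities."""
    return sum(w * d(a, b) / s for w, d, s in zip(weights, dists, scales))

def knn_predict(X_train, y_train, x, weights, scales, k=3):
    """Majority vote among the k training samples closest to x."""
    ranked = sorted(range(len(X_train)),
                    key=lambda i: combined(x, X_train[i], weights, scales))
    votes = Counter(y_train[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]

# Made-up toy "expression profiles", not data from the paper.
X = [[1.0, 2.0, 1.5], [1.2, 1.9, 1.4], [0.9, 2.1, 1.6],
     [5.0, 0.5, 4.0], [5.2, 0.4, 4.1], [4.9, 0.6, 3.9]]
y = ["normal", "normal", "normal", "tumor", "tumor", "tumor"]

weights, scales = fit_weights(X, y)
print(weights)
print(knn_predict(X, y, [1.1, 2.0, 1.5], weights, scales))
```

Dividing each dissimilarity by its mean before combining keeps measures with different ranges comparable, so the learned weights rather than the raw magnitudes decide how much each one contributes. A regularized formulation such as the one described in the abstract would additionally penalize complex combinations, which this sketch does not attempt.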
