Conference: World Congress on Engineering

A Fuzzy Approach to Clustering and Selecting Features for Classification of Gene Expression Data


Abstract

Classification assigns a discrete value, called a label, to each sample in a dataset based on its feature values. In this research, we consider datasets that contain few samples but a very large number of features per sample. Most biological datasets, such as microarrays, have this property. A fundamental contribution of this article is a major extension of previous work on crisp data clustering. The new approach is based on fuzzy feature clustering, which is used to select the best features (genes). The proposed method has two advantages over the crisp method: first, it leads to more stability and faster convergence; second, it improves the accuracy of the classifier that uses the selected features. Moreover, this paper proposes a novel method for discretizing continuous data using the Fisher criterion. In addition, a new method for initializing cluster centers is suggested. The proposed method achieves a considerable improvement over the crisp version. The leukemia dataset is used to illustrate the effectiveness of the method.
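The abstract does not spell out how the Fisher criterion enters the discretization step, so the following is only a minimal sketch of the standard two-class Fisher criterion used as a per-feature score, (m_a - m_b)^2 / (v_a + v_b), where m and v are the class-wise mean and variance of one feature. The function names `fisher_score` and `rank_features`, the toy data, and the labels "A" and "B" are illustrative assumptions, not taken from the paper.

```python
from statistics import mean, pvariance


def fisher_score(values, labels):
    """Two-class Fisher criterion for one feature: (m_a - m_b)^2 / (v_a + v_b).

    Assumption: exactly two distinct labels; this is a generic score,
    not the paper's discretization procedure.
    """
    classes = sorted(set(labels))
    if len(classes) != 2:
        raise ValueError("this sketch handles exactly two classes")
    a = [v for v, y in zip(values, labels) if y == classes[0]]
    b = [v for v, y in zip(values, labels) if y == classes[1]]
    spread = pvariance(a) + pvariance(b)
    gap = (mean(a) - mean(b)) ** 2
    if spread == 0:
        # Degenerate case: no within-class variation at all.
        return float("inf") if gap > 0 else 0.0
    return gap / spread


def rank_features(samples, labels):
    """Return (feature indices sorted by descending score, list of scores)."""
    n_features = len(samples[0])
    scores = [
        fisher_score([row[j] for row in samples], labels)
        for j in range(n_features)
    ]
    order = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return order, scores


# Hypothetical toy data: 4 samples, 2 features (genes), two classes.
samples = [[1.0, 5.0], [2.0, 4.0], [9.0, 5.0], [10.0, 4.0]]
labels = ["A", "A", "B", "B"]
order, scores = rank_features(samples, labels)
print(order, scores)
```

In this toy case the first feature separates the two classes while the second does not, so a selection step based on such a score would prefer the first feature.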
