Source: Proceedings of the 2015 39th National Systems Conference

A sequential cosine similarity based feature selection technique for high dimensional datasets


Abstract

Because information processing is now part of everyday life, databases have grown very large. In most cases, not all parameters (here called features) are needed to decide the outcome (or decision) of an instance, so feature selection is an important step in data processing. This paper presents a novel feature selection method. The cosine similarity between each individual feature of the database and the corresponding class is computed, and the features are stored in an array in descending order of similarity. The first feature of this array is then combined with the remaining features sequentially, one at a time. If the classification accuracy of the combination increases, the combination is accepted; otherwise, the feature responsible is eliminated from the combination. In this manner all features are tested and a final subset of features is obtained. Results from rigorous experiments with the proposed method on high dimensional databases, and comparisons with other methods reported so far, are encouraging. The authors therefore recommend the proposed method for high dimensional data processing.
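The procedure described in the abstract can be sketched roughly as follows. This is only an illustrative reading, not the authors' implementation. The abstract does not name the classifier, so a leave-one-out 1-nearest-neighbour accuracy is assumed here as a stand-in evaluator. Class labels are assumed to be numerically encoded, and features are ranked by raw (not absolute) cosine similarity. The function names `cosine_similarity`, `loo_1nn_accuracy`, and `sequential_cosine_selection`, and the toy data, are hypothetical.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # assumption: an all-zero vector is treated as uninformative
    return dot / (norm_u * norm_v)

def loo_1nn_accuracy(X, y, features):
    """Leave-one-out accuracy of 1-NN on the given feature indices.

    Assumed stand-in: the abstract does not say which classifier is used.
    """
    correct = 0
    for i, row in enumerate(X):
        best_j, best_d = None, float("inf")
        for j, other in enumerate(X):
            if i == j:
                continue
            d = sum((row[f] - other[f]) ** 2 for f in features)
            if d < best_d:
                best_j, best_d = j, d
        correct += y[best_j] == y[i]
    return correct / len(X)

def sequential_cosine_selection(X, y, evaluate=loo_1nn_accuracy):
    """Rank features by cosine similarity to the class vector, then add them
    one at a time, keeping a feature only if accuracy strictly increases."""
    n_features = len(X[0])
    columns = [[row[f] for row in X] for f in range(n_features)]
    ranked = sorted(range(n_features),
                    key=lambda f: cosine_similarity(columns[f], y),
                    reverse=True)  # descending similarity
    selected = [ranked[0]]  # start from the top-ranked feature
    best = evaluate(X, y, selected)
    for f in ranked[1:]:
        score = evaluate(X, y, selected + [f])
        if score > best:  # accept the combination
            selected.append(f)
            best = score
        # otherwise feature f is dropped and the subset is unchanged
    return selected, best

# Toy data (hypothetical): feature 0 tracks the class, 1 is noise, 2 is constant.
y = [0, 0, 0, 1, 1, 1]
X = [
    [0.1, 5, 1],
    [0.2, 1, 1],
    [0.0, 3, 1],
    [0.9, 4, 1],
    [1.0, 2, 1],
    [0.8, 6, 1],
]
selected, best = sequential_cosine_selection(X, y)
print(selected, best)
```

On this toy data, feature 0 already separates the two classes by itself, so neither of the remaining features can raise the accuracy and both are rejected. Passing a different `evaluate` function substitutes another classifier without changing the selection loop.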
