Analysis and Application of Normalization Methods with Supervised Feature Weighting to Improve K-means Accuracy


Abstract

Normalization methods are widely employed to transform the variables, or features, of a given dataset. In this paper, three classical feature normalization methods, Standardization (St), Min-Max (MM) and Median Absolute Deviation (MAD), are studied on different synthetic datasets from the UCI repository. An exhaustive analysis of the transformed features' ranges and their influence on the Euclidean distance is performed, concluding that knowledge of the group structure captured by each feature is needed to select the best normalization method for a given dataset. To effectively capture the features' importance and adjust their contribution, this paper proposes a two-stage methodology for normalization and supervised feature weighting, based on the Pearson correlation coefficient and on a Random Forest feature importance estimation method. Simulations on five different datasets reveal that, in terms of accuracy, the proposed two-stage methodology outperforms or at least matches the K-means performance obtained when only normalization is applied.
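The two stages described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract gives no formulas, so the textbook definitions of St, MM and MAD are assumed, the weight of each feature is assumed to be the absolute Pearson correlation with the class labels (normalized to sum to 1), and the Random Forest variant is omitted to keep the example dependency-free. The function names and the toy data are hypothetical.

```python
import numpy as np

def standardize(X):
    """Standardization (St): zero mean and unit standard deviation per feature."""
    return (X - X.mean(axis=0)) / X.std(axis=0)

def min_max(X):
    """Min-Max (MM): rescale each feature to the range [0, 1]."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    return (X - lo) / (hi - lo)

def mad_normalize(X):
    """Median Absolute Deviation (MAD): centre on the median, scale by the MAD."""
    med = np.median(X, axis=0)
    mad = np.median(np.abs(X - med), axis=0)
    return (X - med) / mad

def pearson_weights(X, y):
    """Assumed weighting scheme: |Pearson r| between each feature and the
    labels, normalized so that the weights sum to 1."""
    r = np.array([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])
    w = np.abs(r)
    return w / w.sum()

def apply_weights(X, w):
    """Scale each feature by sqrt(w) so that the plain Euclidean distance on
    the result equals the weighted Euclidean distance on X."""
    return X * np.sqrt(w)

# Toy data (hypothetical): one feature separates the two groups,
# the other is pure noise with a much larger range.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 50)
informative = y * 4.0 + rng.normal(size=100)
noise = rng.normal(scale=10.0, size=100)
X = np.column_stack([informative, noise])

X_st = standardize(X)               # stage 1: normalization
w = pearson_weights(X_st, y)        # stage 2: supervised feature weights
X_weighted = apply_weights(X_st, w) # input for a distance-based method such as K-means
```

None of the three normalization functions guards against constant features, which would cause a division by zero. The weighted matrix can be passed unchanged to any K-means implementation, since scaling by the square root of the weights makes the ordinary Euclidean distance behave as the weighted one.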


Bibliographic record

Source
International Conference on Soft Computing Models in Industrial and Environmental Applications | 2020 | xxi, 609 p. | 11 pages
Authors
Iratxe Nino-Adan; Itziar Landa-Torres; Eva Portillo; Diana Manjarres
Original format
PDF
Chinese Library Classification
TP18-532
Keywords
Normalization; Standardization; Weighted Euclidean Distance; Pearson correlation; Random Forest; K-means


