Scale-Independent Clustering Method with Automatic Variable Selection Based on Trees

Abstract

Clustering is the process of putting observations into groups based on their distance, or dissimilarity, from one another. Measuring distance for continuous variables often requires scaling or monotonic transformation. Determining dissimilarity when observations have both continuous and categorical measurements can be difficult because each type of measurement must be approached differently. We introduce a new clustering method that uses one of three new distance metrics. In a dataset with p variables, we create p trees, one with each variable as the response. Distance is measured by determining on which leaf an observation falls in each tree. Two observations are similar if they tend to fall on the same leaf and dissimilar if they are usually on different leaves. The distance metrics are not affected by scaling or transformations of the variables and easily determine distances in datasets with both continuous and categorical variables. This method is tested on several well-known datasets, both with and without added noise variables, and performs very well in the presence of noise due in part to automatic variable selection. The new distance metrics outperform several existing clustering methods in a large number of scenarios.
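The leaf-based distance described in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the report: depth-one regression trees (stumps) written in plain Python stand in for full trees, only continuous variables are handled, and the distance is simply the share of trees in which two observations land on different leaves, which is one plausible reading of the abstract and not necessarily any of the report's three metrics. The helper names (`sse`, `fit_stump`, `leaf_matrix`, `tree_distances`) and the toy data are hypothetical. The sketch does no pruning, so it does not reproduce the automatic variable selection the abstract mentions.

```python
from itertools import combinations


def sse(values):
    """Sum of squared errors around the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def fit_stump(rows, response):
    """Pick the (predictor, threshold) split that best explains `response`.

    Returns None when no predictor offers a split with two non-empty sides.
    """
    best = None  # (error, predictor, threshold)
    for predictor in range(len(rows[0])):
        if predictor == response:
            continue
        levels = sorted({row[predictor] for row in rows})
        for threshold in levels[:-1]:  # dropping the max keeps both sides non-empty
            left = [r[response] for r in rows if r[predictor] <= threshold]
            right = [r[response] for r in rows if r[predictor] > threshold]
            error = sse(left) + sse(right)
            if best is None or error < best[0]:
                best = (error, predictor, threshold)
    return best


def leaf_matrix(rows):
    """One stump per variable-as-response; record each row's leaf (0 or 1)."""
    leaves = []
    for response in range(len(rows[0])):
        stump = fit_stump(rows, response)
        if stump is None:
            continue  # no usable split for this response
        _, predictor, threshold = stump
        leaves.append([int(row[predictor] > threshold) for row in rows])
    return leaves


def tree_distances(rows):
    """Distance = share of trees in which two rows fall on different leaves."""
    leaves = leaf_matrix(rows)
    n = len(rows)
    dist = [[0.0] * n for _ in range(n)]
    if not leaves:
        return dist
    for i, j in combinations(range(n), 2):
        differing = sum(tree[i] != tree[j] for tree in leaves)
        dist[i][j] = dist[j][i] = differing / len(leaves)
    return dist


# Hypothetical toy data: columns 0 and 1 separate two groups, column 2 is noise.
data = [
    (1.0, 10.0, 0.3),
    (1.2, 11.0, 0.9),
    (0.9, 9.5, 0.1),
    (5.0, 50.0, 0.8),
    (5.3, 52.0, 0.2),
    (4.8, 49.0, 0.6),
]
dist = tree_distances(data)

# Rescale the first column and compare the two distance matrices.
rescaled = [(a * 1024, b, c) for a, b, c in data]
print(dist == tree_distances(rescaled))
```

Because each split depends only on how a predictor orders the observations, rescaling a column leaves the leaf assignments, and therefore the distances, unchanged in this sketch. The resulting matrix could be passed to any clustering routine that accepts precomputed dissimilarities.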

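Bibliographic record

Author: Lynch, S K
Collection: U.S. government science and technology reports
Year: 2014
Pages: 1-49
Total pages: 49
Original format: PDF
Language: English (eng)
Chinese Library Classification: Industrial technology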

