A Discretization Algorithm of Continuous Attributes Based on Supervised Clustering
Abstract
Many machine learning algorithms can be applied only to data described by categorical attributes, so discretization of continuous attributes is an important preprocessing step in knowledge extraction. Traditional clustering-based discretization algorithms require a predetermined number of clusters k and are typically applied in an unsupervised learning framework. This paper describes SX-means (Supervised X-means), a new supervised algorithm for discretizing continuous attributes based on clustering. The algorithm dynamically modifies the clusters using knowledge of the class distribution, and the procedure continues until a suitable k is found. Because k is not predetermined by the user and the class distribution is taken into account, the randomness of the result is greatly reduced. An experimental evaluation of several discretization algorithms on six artificial data sets shows that the proposed algorithm is more efficient and generates a better discretization scheme. When C4.5 is run on the discretized data, the resulting tree is smaller, has fewer classification rules, and achieves high classification accuracy.
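The abstract does not spell out the steps of SX-means, so the sketch below is not the authors' algorithm. It is a minimal illustration of the general idea: clusters of a single continuous attribute are split repeatedly with 1-D 2-means until each cluster is dominated by one class, so that k is discovered rather than supplied. The function names, the purity threshold `min_purity`, and the `min_size` parameter are all assumptions introduced here for illustration.

```python
from collections import Counter


def purity(labels):
    """Fraction of the labels that belong to the majority class."""
    return Counter(labels).most_common(1)[0][1] / len(labels)


def two_means_split(points):
    """Split (value, label) pairs, sorted by value, with 1-D 2-means.

    Centroids start at the smallest and largest value (an assumption).
    Either returned group may be empty if all values are identical.
    """
    lo, hi = points[0][0], points[-1][0]
    left, right = points, []
    for _ in range(100):  # hard cap on iterations
        left = [p for p in points if abs(p[0] - lo) <= abs(p[0] - hi)]
        right = [p for p in points if abs(p[0] - lo) > abs(p[0] - hi)]
        if not left or not right:
            break
        new_lo = sum(v for v, _ in left) / len(left)
        new_hi = sum(v for v, _ in right) / len(right)
        if (new_lo, new_hi) == (lo, hi):  # centroids stopped moving
            break
        lo, hi = new_lo, new_hi
    return left, right


def discretize(values, labels, min_purity=0.9, min_size=1):
    """Return cut points for one continuous attribute.

    A cluster is split while its class purity is below min_purity;
    the number of clusters is therefore not chosen in advance.
    """
    points = sorted(zip(values, labels), key=lambda p: p[0])
    if not points:
        return []
    pending, final = [points], []
    while pending:
        cluster = pending.pop()
        cluster_labels = [label for _, label in cluster]
        if purity(cluster_labels) >= min_purity or len(cluster) < 2 * min_size:
            final.append(cluster)
            continue
        left, right = two_means_split(cluster)
        if len(left) < min_size or len(right) < min_size:
            final.append(cluster)  # cannot split further
            continue
        pending.extend([left, right])
    final.sort(key=lambda c: c[0][0])
    # Place each cut point midway between neighbouring clusters.
    return [(a[-1][0] + b[0][0]) / 2 for a, b in zip(final, final[1:])]


values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0, 30.0, 31.0, 32.0]
labels = ["a", "a", "a", "b", "b", "b", "a", "a", "a"]
cuts = discretize(values, labels)
print(cuts)
```

On this toy data the attribute is divided into three intervals without k being specified. A fuller treatment would also merge adjacent clusters that share a majority class and would use a principled stopping criterion instead of a fixed purity threshold.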