
A new measure of classifiability and its applications.


Abstract

Characterizing the difficulty of a pattern classification problem is an open and challenging problem in machine learning. While some progress has been made in understanding the difficulty of learning a concept (as in the PAC learning framework), the more pertinent and challenging problem of characterizing the difficulty of a problem given a specific and finite sample has not been addressed.

In this dissertation we develop a new measure of classifiability, motivated in part by the fact that an n-dimensional classification problem may be visualized in (n + 1) dimensions using the class label as the (n + 1)th dimension. In such a visualization, the class label provides a surface which is smooth in regions where classes are non-interlaced and rough in regions where classes are interlaced. The texture of the “class label surface” thus provides an intuitive measure of pattern classifiability. We establish Bayes-sense optimality of the proposed measure of classifiability and present some experimental results based on a simple algorithm to compute the proposed classifiability measure.

The new classifiability measure can be used broadly in solving classification problems, since it considers not only the number of pattern instances of each class (purity) at the current stage but also the spatial distribution of these instances, which lets it estimate the effect of further classification. In this dissertation, we develop new approaches for crisp and fuzzy decision tree induction, decision tree pre-pruning, and feature subset selection based on the classifiability measure. The proposed algorithms outperform existing algorithms on several standard test datasets as well as on a real-world problem: evaluating skin condition.
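The "class label surface" intuition can be illustrated with a small sketch. The abstract does not give the dissertation's algorithm, so the code below is not the proposed measure: it substitutes a simple k-nearest-neighbour label-agreement score as a stand-in for surface smoothness. The function name `label_surface_smoothness`, the parameter `k`, and the toy data are all assumptions made for illustration.

```python
import math


def label_surface_smoothness(points, labels, k=3):
    """Illustrative stand-in, NOT the dissertation's measure.

    For each point, take its k nearest neighbours (Euclidean distance)
    and compute the fraction that share the point's class label.
    The mean over all points is close to 1 when classes occupy
    separate regions (a smooth label surface) and drops when classes
    are interlaced (a rough label surface).
    """
    if len(points) != len(labels):
        raise ValueError("points and labels must have the same length")
    n = len(points)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < number of points")

    total = 0.0
    for i in range(n):
        # Distances from point i to every other point, nearest first.
        others = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )
        neighbours = [j for _, j in others[:k]]
        agree = sum(labels[j] == labels[i] for j in neighbours)
        total += agree / k
    return total / n


# Hypothetical toy data: two well-separated clusters.
separated_x = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
               (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
separated_y = [0, 0, 0, 1, 1, 1]

# Hypothetical toy data: labels alternating along a line (interlaced).
interlaced_x = [(float(i), 0.0) for i in range(6)]
interlaced_y = [0, 1, 0, 1, 0, 1]

smooth_score = label_surface_smoothness(separated_x, separated_y, k=2)
rough_score = label_surface_smoothness(interlaced_x, interlaced_y, k=2)
print(smooth_score, rough_score)
```

On this toy data the separated clusters score higher than the interlaced points, which matches the smooth-versus-rough picture in the abstract. Unlike a pure class-count (purity) statistic, the score depends on where the instances lie relative to one another.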

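Bibliographic details

  • Author

    Dong, Ming

  • Author affiliation

    University of Cincinnati

  • Degree-granting institution: University of Cincinnati
  • Subject: Engineering, Electronics and Electrical
  • Degree: Ph.D.
  • Year: 2001
  • Pagination: 87 p.
  • Total pages: 87
  • Original format: PDF
  • Language of text: English
  • Chinese Library Classification: Radio electronics, telecommunications technology
  • Date indexed: 2022-08-17 11:47:08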
