Machine Learning (ML95)

Compression-Based Discretization of Continuous Attributes



Abstract

Discretization of continuous attributes into ordered discrete attributes can be beneficial even for propositional induction algorithms that are capable of handling continuous attributes directly. Benefits include possibly large improvements in induction time, smaller sizes of induced trees or rule sets, and even improved predictive accuracy. We define a global evaluation measure for discretizations based on the so-called Minimum Description Length (MDL) principle from information theory. Furthermore we describe the efficient algorithmic usage of this measure in the MDL-Disc algorithm. The new method solves some problems of alternative local measures used for discretization. Empirical results in a few natural domains and extensive experiments in an artificial domain show that MDL-Disc scales up well to large learning problems involving noise.
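The core idea of an MDL-based global evaluation measure can be illustrated with a small sketch. The function below is not the exact MDL-Disc formula from the paper; it is a generic MDL-style cost under the usual two-part scheme: the bits needed to describe the discretization itself (number and position of cut points) plus the bits needed to encode the class labels within each resulting interval. The function name `mdl_cost` and the specific encoding terms are illustrative assumptions.

```python
import math
from collections import Counter

def mdl_cost(values, labels, cuts):
    """Generic MDL-style score for a discretization (illustrative,
    not the paper's exact measure): model cost of the cut points
    plus data cost of the class labels within each interval."""
    n = len(values)
    k = len(cuts)
    # Model cost: bits to state how many cuts were chosen and which of
    # the n-1 candidate boundaries they occupy.
    model_cost = math.log2(n) + (math.log2(math.comb(n - 1, k)) if k else 0.0)
    # Data cost: sum over intervals of the entropy-based code length
    # of the labels falling in that interval.
    data_cost = 0.0
    bounds = [float("-inf")] + sorted(cuts) + [float("inf")]
    for lo, hi in zip(bounds, bounds[1:]):
        seg = [c for v, c in zip(values, labels) if lo <= v < hi]
        m = len(seg)
        for cnt in Counter(seg).values():
            data_cost += -cnt * math.log2(cnt / m)
    return model_cost + data_cost
```

Because the measure is global, candidate discretizations with different numbers of cut points are directly comparable: a cut that cleanly separates classes reduces the data cost by more than it adds to the model cost, while spurious cuts (e.g. those fitting noise) are penalized.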
