Conference: IEEE International Conference on Tools with Artificial Intelligence

Pruning fuzzy ARTMAP using the minimum description length principle in learning from clinical databases

Abstract

The fuzzy ARTMAP is one of a family of neural network architectures based on ART (adaptive resonance theory) in which supervised learning can be carried out. However, it tends to create more categories than are actually needed. This often causes the so-called overfitting problem, where the performance of fuzzy ARTMAP networks on the test set does not increase monotonically with additional training epochs and category creation. To avoid overfitting, Carpenter and Tan (1993) proposed a confidence-based pruning method that eliminates those categories that are either less useful or less accurate. This paper proposes an alternative pruning method based on the minimum description length (MDL) principle. The MDL principle can be viewed as a tradeoff between the complexity of a theory and the accuracy with which the theory predicts the data. We adopted Cameron-Jones's (1992) error encoding scheme and Quinlan's (1994, 1995) modification for theory encoding to estimate the description length of a fuzzy ARTMAP theory. A greedy MDL search algorithm is proposed to prune the fuzzy ARTMAP categories one by one. Experiments showed that a fuzzy ARTMAP pruned with the MDL principle performed better, with far fewer categories, than the original fuzzy ARTMAP and other machine-learning systems on a number of benchmark clinical databases, including heart disease, breast cancer and diabetes databases.
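
The greedy one-by-one pruning described in the abstract could be sketched roughly as follows. This is only an illustration under loud assumptions, not the paper's implementation: categories are stood in for by nearest-prototype (vector, label) pairs rather than fuzzy ARTMAP categories, the theory cost is a hypothetical fixed number of bits per category (`bits_per_category`), and the exception cost is a simple binomial encoding used as a stand-in for the Cameron-Jones and Quinlan schemes, whose details are not given here. All function names are hypothetical.

```python
import math


def exception_bits(n, e):
    """Bits to state how many (e) and which of n training cases are misclassified.

    Assumption: a plain binomial encoding, not the paper's exact scheme.
    """
    return math.log2(n + 1) + math.log2(math.comb(n, e))


def predict(categories, x):
    """Return the label of the category whose prototype is nearest to x."""
    nearest = min(
        categories,
        key=lambda c: sum((a - b) ** 2 for a, b in zip(c[0], x)),
    )
    return nearest[1]


def description_length(categories, data, bits_per_category):
    """Total description length: theory bits plus bits for the exceptions."""
    errors = sum(1 for x, y in data if predict(categories, x) != y)
    theory_bits = bits_per_category * len(categories)
    return theory_bits + exception_bits(len(data), errors)


def greedy_mdl_prune(categories, data, bits_per_category=8.0):
    """Remove one category at a time while the description length decreases."""
    current = list(categories)
    best_dl = description_length(current, data, bits_per_category)
    while len(current) > 1:
        # Try deleting each remaining category and keep the cheapest result.
        trials = [current[:i] + current[i + 1:] for i in range(len(current))]
        scored = [(description_length(t, data, bits_per_category), t) for t in trials]
        dl, trial = min(scored, key=lambda pair: pair[0])
        if dl >= best_dl:
            break  # no single deletion shortens the description any further
        current, best_dl = trial, dl
    return current, best_dl


# Toy one-dimensional data: class 0 near 0.0, class 1 near 1.0.
data = [((i / 40,), 0) for i in range(10)] + [((1 - i / 40,), 1) for i in range(10)]
# Four categories where two would suffice.
categories = [((0.0,), 0), ((0.2,), 0), ((0.8,), 1), ((1.0,), 1)]

initial_dl = description_length(categories, data, 8.0)
pruned, final_dl = greedy_mdl_prune(categories, data, 8.0)
```

On this toy data the search drops the two redundant categories and then stops, because removing a third would misclassify half the cases and the extra exception bits outweigh the saving in theory bits. That stopping condition is the tradeoff between theory complexity and prediction accuracy that the abstract refers to.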
