Rule induction on large noisy data sets

Abstract

Efficient techniques for inducing rules used to classify data items in a noisy data set. The prior-art IREP technique produces a set of classification rules by inducing each rule, pruning it, and continuing in this way until a stopping condition is reached. It is improved with a new rule-value metric for stopping pruning and with a stopping condition that depends on the description length of the rule set. The rule set that results from the improved IREP technique is then optimized by pruning rules from the set to minimize the description length. It is further optimized by making a replacement rule and a modified rule for each rule and using the description length to determine whether to use the replacement rule, the modified rule, or the original rule in the rule set. Further improvement is achieved by inducing rules for data items not covered by the original set and then pruning these rules. Still further improvement is gained by repeating, a fixed number of times, the steps of inducing rules for data items not covered, pruning the rules, optimizing the rules, and pruning again. The fully developed technique has the O(n log² n) running time characteristic of IREP, but produces rule sets that classify substantially better than those produced by IREP.
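
The induce-then-prune loop described in the abstract can be illustrated with a minimal Python sketch. It is not the patented method: the abstract does not state the rule-value formula, so the metric (p - n) / (p + n) used below is an assumption; rules are grown by simple precision; the description-length stopping condition is replaced by a plain "no better than chance on the pruning set" check; and the grow/prune split is a fixed every-third-example split rather than a random one. All function names and the toy data are hypothetical.

```python
from typing import Dict, List, Tuple

Example = Tuple[Dict[str, str], bool]  # (attribute values, is_positive)
Rule = List[Tuple[str, str]]           # conjunction of attribute == value tests


def covers(rule: Rule, attrs: Dict[str, str]) -> bool:
    """A rule covers an example when every one of its tests holds."""
    return all(attrs.get(a) == v for a, v in rule)


def counts(rule: Rule, data: List[Example]) -> Tuple[int, int]:
    """Return (positives covered, negatives covered)."""
    p = sum(1 for attrs, pos in data if pos and covers(rule, attrs))
    n = sum(1 for attrs, pos in data if not pos and covers(rule, attrs))
    return p, n


def precision(rule: Rule, data: List[Example]) -> float:
    p, n = counts(rule, data)
    return p / (p + n) if p + n else 0.0


def rule_value(rule: Rule, prune_set: List[Example]) -> float:
    """Assumed rule-value metric (p - n) / (p + n); not given in the abstract."""
    p, n = counts(rule, prune_set)
    return (p - n) / (p + n) if p + n else -1.0


def grow_rule(grow_set: List[Example]) -> Rule:
    """Greedily add the test that most improves precision (a stand-in heuristic)."""
    rule: Rule = []
    while counts(rule, grow_set)[1] > 0:  # still covers some negatives
        tests = sorted({(a, v) for attrs, _ in grow_set
                        for a, v in attrs.items()} - set(rule))
        best = max(tests, key=lambda t: precision(rule + [t], grow_set),
                   default=None)
        if best is None or (precision(rule + [best], grow_set)
                            <= precision(rule, grow_set)):
            break  # no test improves the rule any further
        rule.append(best)
    return rule


def prune_rule(rule: Rule, prune_set: List[Example]) -> Rule:
    """Keep the shortest prefix of the rule with the highest rule value."""
    if not rule:
        return rule
    prefixes = [rule[:k] for k in range(1, len(rule) + 1)]
    return max(prefixes, key=lambda r: rule_value(r, prune_set))


def induce_rules(data: List[Example]) -> List[Rule]:
    """Grow and prune one rule at a time until positives run out or a rule fails."""
    rules: List[Rule] = []
    remaining = list(data)
    while any(pos for _, pos in remaining):
        prune_set = remaining[2::3]  # every third example (assumed split)
        grow_set = [ex for i, ex in enumerate(remaining) if i % 3 != 2]
        rule = prune_rule(grow_rule(grow_set), prune_set)
        # Simplified stopping condition, standing in for the description-length test.
        if not rule or rule_value(rule, prune_set) <= 0:
            break
        rules.append(rule)
        remaining = [ex for ex in remaining if not covers(rule, ex[0])]
    return rules


# Toy data: an item is positive exactly when its color is red.
DATA: List[Example] = [
    ({"color": "red", "size": "s"}, True),
    ({"color": "red", "size": "l"}, True),
    ({"color": "red", "size": "s"}, True),
    ({"color": "blue", "size": "s"}, False),
    ({"color": "blue", "size": "l"}, False),
    ({"color": "blue", "size": "l"}, False),
    ({"color": "red", "size": "l"}, True),
    ({"color": "blue", "size": "s"}, False),
    ({"color": "red", "size": "s"}, True),
]

print(induce_rules(DATA))  # prints [[('color', 'red')]]
```

Growing on one part of the data and pruning on a held-out part is what lets the pruning step remove conditions that only fit noise in the growing set. The later optimization passes the abstract describes (replacement and modified rules chosen by description length) are not shown here.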

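Bibliographic data

  • Publication number: US5719692A

  • Patent type:

  • Publication date: 1998-02-17

  • Original format: PDF

  • Applicant/assignee: LUCENT TECHNOLOGIES INC.

  • Application number: US19950499247

  • Inventor: WILLIAM W. COHEN

  • Filing date: 1995-07-07

  • Classification: G06F17/00; G06F15/00

  • Country: US

  • Database entry time: 2022-08-22 02:40:08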
