
Margin-Based First-Order Rule Learning


Abstract

We performed three series of experiments. In a first batch on the mutagenesis data, we evaluated the sensitivity of the method to variations of the parameters and determined default settings. In particular, it turned out that the performance with p set to one is consistently worse than with p > 1. This is an indication that many different structural features contribute equally to the performance of the classifier. Another finding is that the performance does not degrade as more and more rules are added. In other words, overfitting does not seem to occur too easily. In a second batch of experiments on seven small molecule datasets, we showed that margin-based rule learning performs favorably compared to margin-based ILP approaches using kernels. In our third batch, variants of propositionalization and relational learning are tested on the task of bioavailability prediction. To investigate the "feature efficiency" of those variants, we plot the training set and test set accuracies against the number of rules added.

In summary, we propose relational rule learning based on margins. The new approach optimizes the mean margin minus its variance (MMV). Error bounds can be derived to obtain a theoretically sound stopping criterion. Overall, MMV optimization seems to be a useful new learning scheme that can be adapted to various data types via plug-ins, and can be adjusted to the noise level via parameters. As the optimization is linear in the number of instances, it should also scale up well for the analysis of larger datasets.
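The abstract states the objective only in words: the mean margin minus its variance. The sketch below illustrates that quantity on toy data, and it rests on assumptions not given in the source: labels are +1/-1, each rule is a binary feature that fires (1) or not (0) on an instance, and the margin of an instance is its label times the weighted vote of the rules. The names `margins` and `mmv_objective` are hypothetical, and the sketch covers neither the parameter p nor the rule search itself.

```python
from statistics import fmean, pvariance


def margins(features, labels, weights):
    """Per-instance margin y_i * sum_j w_j * x_ij (assumed definition)."""
    return [
        y * sum(w * x for w, x in zip(weights, row))
        for row, y in zip(features, labels)
    ]


def mmv_objective(features, labels, weights):
    """Mean margin minus the (population) variance of the margins."""
    m = margins(features, labels, weights)
    return fmean(m) - pvariance(m)


# Toy data (invented for illustration): 4 instances, 2 rules.
features = [[1, 0], [1, 1], [0, 1], [0, 0]]
labels = [1, 1, -1, -1]
weights = [0.5, -0.5]  # hypothetical rule weights

print(mmv_objective(features, labels, weights))
```

With a fixed set of rules, the margins, their mean, and their variance are all computed in passes over the instances, which is consistent with the abstract's remark that the optimization is linear in the number of instances.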
