We performed three series of experiments. In the first batch, on the mutagenesis data, we evaluated the sensitivity of the method to variations of the parameters and determined default settings. In particular, it turned out that the performance with p set to one is consistently worse than with p > 1, indicating that many different structural features contribute equally to the performance of the classifier. Another finding is that the performance does not degrade as more and more rules are added; in other words, overfitting does not seem to occur easily. In the second batch of experiments, on seven small-molecule datasets, we showed that margin-based rule learning performs favorably compared to margin-based ILP approaches using kernels. In the third batch, variants of propositionalization and relational learning were tested on the task of bioavailability prediction. To investigate the "feature efficiency" of these variants, we plotted the training and test set accuracies against the number of rules added.

In summary, we propose relational rule learning based on margins. The new approach optimizes the mean margin minus its variance. Error bounds can be derived to obtain a theoretically sound stopping criterion. Overall, MMV optimization appears to be a useful new learning scheme that can be adapted to various data types via plug-ins and adjusted to the noise level via parameters. As the optimization is linear in the number of instances, it should also scale well to the analysis of larger datasets.