We derive new margin-based inequalities for the probability of error of classifiers. The main feature of these bounds is that they can be calculated using the training data and therefore may be effectively used for model selection purposes. In particular, the bounds involve empirical complexities measured on the training data (such as the empirical fat-shattering dimension) as opposed to their worst-case counterparts traditionally used in such analyses. Also, our bounds appear to be sharper and more general than recent results involving empirical complexity measures. In addition, we develop an alternative data-based bound for the generalization error of classes of convex combinations of classifiers involving an empirical complexity measure that is easier to compute than the empirical covering number or fat-shattering dimension. We also show examples of efficient computation of the new bounds.
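To illustrate what "calculated using the training data" means in practice, the sketch below shows the generic shape shared by data-dependent margin bounds: an empirical margin error at scale gamma plus a complexity penalty evaluated on the sample. This is a schematic only, not the paper's actual inequality; the function name, the square-root penalty form, the constants, and the `complexity` input (a stand-in for an empirical complexity measure such as an estimate of the empirical fat-shattering dimension) are all illustrative assumptions.

```python
import numpy as np

def data_dependent_margin_bound(margins, gamma, complexity, delta=0.05):
    """Schematic data-dependent bound on the probability of error.

    margins    : array of training margins y_i * f(x_i) for a
                 real-valued classifier f
    gamma      : margin scale at which the empirical error is measured
    complexity : placeholder for an empirical complexity measure
                 computed on the training data (hypothetical input;
                 the paper's exact measure and constants differ)
    delta      : confidence parameter (bound holds with prob. 1 - delta)
    """
    n = len(margins)
    # Fraction of training points classified with margin below gamma.
    emp_margin_error = np.mean(margins < gamma)
    # Generic O(sqrt(complexity / n)) penalty; illustrative form only.
    penalty = np.sqrt((complexity + np.log(2.0 / delta)) / n)
    return emp_margin_error + penalty

# Usage sketch on synthetic margins for 200 training points.
rng = np.random.default_rng(0)
margins = rng.normal(loc=0.5, scale=0.4, size=200)
print(data_dependent_margin_bound(margins, gamma=0.2, complexity=10.0))
```

The point of the sketch is that every quantity on the right-hand side is observable from the sample, which is what makes such bounds usable for model selection, in contrast to bounds whose complexity term is a worst-case quantity over the whole input space.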