We implemented versions of the SVM appropriate for one-class classification in the context of information retrieval. The experiments were conducted on the standard Reuters data set. For the SVM implementation we used both the version of Schoelkopf et al. and a somewhat different version of one-class SVM based on identifying "outlier" data as representative of the second class. We report on experiments with different kernels for both of these implementations and with different representations of the data, including binary vectors, the tf-idf representation, and a modification called the "Hadamard" representation. We then compared these with one-class versions of the prototype (Rocchio), nearest neighbor, and naive Bayes algorithms, and finally with a natural one-class neural network classification method based on "bottleneck" compression generated filters. The SVM approach as represented by Schoelkopf et al. was superior to all the methods except the neural network one, to which it was essentially comparable, although occasionally worse. However, the SVM methods turned out to be quite sensitive to the choice of representation and kernel in ways which are not well understood; for the time being, this leaves the neural network approach as the most robust.
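To make the one-class setting concrete, the following is a minimal sketch of one of the baselines named above, a prototype (Rocchio-style) classifier trained on positive documents only. It is not the implementation used in the experiments: the class name `OneClassPrototype`, the smoothed idf formula, and the rule that sets the acceptance threshold from a fraction of the training documents (`accept_fraction`) are all assumptions made for illustration.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


class OneClassPrototype:
    """Hypothetical one-class prototype classifier (illustrative only)."""

    def __init__(self, accept_fraction=0.9):
        # Assumption: the threshold is chosen so that roughly this fraction
        # of the positive training documents is accepted.
        self.accept_fraction = accept_fraction

    def _vectorize(self, doc):
        # tf-idf weights; terms unseen in training get the largest idf, so
        # they count toward the document norm but not toward the dot product.
        return {t: tf * self.idf.get(t, self.default_idf)
                for t, tf in Counter(doc).items()}

    def fit(self, positive_docs):
        n = len(positive_docs)
        df = Counter(t for doc in positive_docs for t in set(doc))
        # Assumption: a smoothed idf variant, one of several in common use.
        self.idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
        self.default_idf = math.log(1 + n) + 1.0
        vectors = [self._vectorize(doc) for doc in positive_docs]

        # The prototype is the mean of the positive training vectors.
        centroid = Counter()
        for vec in vectors:
            for t, w in vec.items():
                centroid[t] += w / n
        self.centroid = dict(centroid)

        # Accept the top accept_fraction of training similarities.
        sims = sorted((cosine(v, self.centroid) for v in vectors), reverse=True)
        k = max(1, math.ceil(self.accept_fraction * len(sims)))
        self.threshold = sims[k - 1]
        return self

    def score(self, doc):
        return cosine(self._vectorize(doc), self.centroid)

    def predict(self, doc):
        # True means "belongs to the positive class".
        return self.score(doc) >= self.threshold
```

Documents are passed as lists of tokens, for example `OneClassPrototype().fit(train_docs).predict(["grain", "export"])`. No negative examples are seen at any point, which is the defining constraint of the one-class setting; replacing the tf-idf weights in `_vectorize` with 0/1 indicators would give the binary representation instead.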