Our paper introduces a new way to filter spam using as background the Kolmogorov complexity theory and as learning component a Support Vector Machine. Our idea is to skip the classical text analysis in use with standard filtering techniques, and to focus on the measure of the informative content of a message to classify it as spam or legitimate. Exploiting the fact that we can estimate a message information content through compression techniques, we represent an e-mail as a multi-dimensional real vector and we train a Support Vector Machine to get a classifier achieving accuracy rates in the range of 90%-97%, bringing our combined technique at the top of the current spam filtering technologies.
展开▼