We address the problem of filtering medical news articles for targeted audiences using a machine learning approach. The approach is term-based, and one of the main difficulties is extracting a feature set appropriate for the medical domain. We describe the application of two supervised machine learning techniques, Decision Trees and Naïve Bayes, to automatically construct classifiers from a training set in which news articles have been pre-classified by a medical expert and four other human readers. The goal is to classify news articles into three groups: non-medical, medical intended for experts, and medical intended for other readers. While the overall accuracy of the machine learning approach is around 78%, the accuracy of distinguishing non-medical articles from medical ones is 92%.
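To make the term-based setup concrete, the following is a minimal sketch of a multinomial Naïve Bayes classifier over raw term counts for the three groups named above. It is not the classifier built in this work: the tokenizer, the add-one smoothing, the class name `TermNaiveBayes`, the label strings, and the six toy training sentences are all assumptions introduced for illustration, and the Decision Tree variant is not shown.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: terms are lowercase alphabetic runs; no stemming or stop list.
    return "".join(c.lower() if c.isalpha() else " " for c in text).split()


class TermNaiveBayes:
    """Multinomial Naive Bayes over term counts with add-one smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)      # documents per class
        self.term_counts = defaultdict(Counter)  # term frequencies per class
        self.vocab = set()
        for doc, label in zip(docs, labels):
            terms = tokenize(doc)
            self.term_counts[label].update(terms)
            self.vocab.update(terms)
        return self

    def predict(self, doc):
        # Terms never seen in training are ignored.
        terms = [t for t in tokenize(doc) if t in self.vocab]
        n_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            total = sum(self.term_counts[label].values())
            score = math.log(n / n_docs)  # log prior
            for t in terms:
                # Smoothed log likelihood of the term given the class.
                score += math.log(
                    (self.term_counts[label][t] + 1) / (total + vocab_size)
                )
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy training set; the real one was labeled by human readers.
train = [
    ("The team won the football match last night", "non-medical"),
    ("Stock markets fell as investors sold shares", "non-medical"),
    ("Randomized trial shows reduced mortality with statin therapy in patients",
     "medical-expert"),
    ("Pharmacokinetics and dosage of the inhibitor in a clinical cohort",
     "medical-expert"),
    ("Eating fruit and walking daily helps keep your heart healthy",
     "medical-general"),
    ("Tips to sleep better and avoid catching a cold this winter",
     "medical-general"),
]
model = TermNaiveBayes().fit([d for d, _ in train], [l for _, l in train])
print(model.predict("Clinical trial of statin dosage in patients"))
```

With so little data the prediction depends entirely on term overlap, which is why the choice of feature set matters so much in a term-based approach.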