We show that the intelligent use of one small piece of contextual information (a document’s publication date) can improve the performance of classifiers trained on a text categorization task. We focus on academic research documents, where the date of publication undoubtedly has an effect on an author’s choice of words. To exploit this contextual feature, we propose the technique of temporal feature modification, which takes various sources of lexical change into account, including changes in term frequency, associative strength between terms and categories, and dynamic categorization systems. We present results of classification experiments using both full text papers and abstracts of conference proceedings, showing improved classification accuracy across the whole collection, with performance increases of greater than 40% when temporal features are exploited. The technique is fast, classifier-independent, and works well even when making only a few modifications.