This paper describes the application and analysis of a previously developed textual emotion classification system (READ-BioMed-EC) to a different data set in the same language but with different textual properties. The classifier makes use of a number of lexicon-based and text-based features. The data set originally used to develop this classifier consisted of English-language Twitter microblogs with mentions of Ebola disease. The data was manually labelled with one of six emotion classes, plus sarcasm, news-related, or neutral. In this new work, we applied the READ-BioMed-EC emotion classifier without retraining to an independently collected set of Web blog posts, also annotated with emotion classes, to understand how well the Twitter-trained, disease-focused emotion classifier might generalise to an entirely different collection of open-domain sentences. The results of our study show that direct cross-genre application of the classifier does not achieve meaningful results, but when retrained on the open-domain data set, the READ-BioMed-EC system outperforms the previously published results. The study has implications for the cross-genre applicability of emotion classifiers, demonstrating that emotion is expressed differently in different text types.