Research in automatic Part of Speech (POS) tagging has been dominated by Markov Model (MM) taggers. Brill [1, 3, 6] has recently described a transformation-based system with comparable accuracy and with simpler algorithms and representation than MM taggers. We present a set-based formal model of natural language ambiguity and semantic tagging that forms a basis for generalising transformation-based learning (TBL) and Brill's TBL tagger [3]. We discuss empirical observations of the training algorithm which suggest that a new evolutionary transformation learning strategy may dramatically improve learning time without loss of accuracy.