The derivation of a probabilistic grammar for main subject field codes from the machine-readable version of the Longman Dictionary of Contemporary English (LDOCE) (P. Procter, 1978) is described. These codes are used in the dictionary to mark the subject area to which a certain sense of a word belongs. The grammar consists of the dictionary itself and a matrix that describes how closely two main subject fields are related to each other in a large training corpus of unrestricted English text.