In this study, we propose a knowledge-independent method for aligning terms, and thus extracting translations, from a small, domain-specific corpus of parallel English and Chinese court judgments from Hong Kong. Given a sentence-aligned corpus, translation equivalents are suggested by analysing the frequency profiles of parallel concordances. The method overcomes the limitations of conventional statistical methods, which require large corpora to be effective, and of lexical approaches, which depend on existing bilingual dictionaries. Pilot testing on a parallel corpus of about 113K Chinese words and 120K English words gives an encouraging 85% precision and 45% recall. Future work includes fine-tuning the algorithm based on an analysis of the errors, and acquiring a translation lexicon for legal terminology by filtering out general terms.
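
As a rough illustration of what analysing the frequency profile of a parallel concordance could look like, the Python sketch below collects the Chinese sides of all sentence pairs whose English side contains a given term, then ranks the Chinese tokens by how strongly they are associated with that term. It is not the algorithm of this study: the Dice coefficient used for scoring, the function name `candidate_translations`, and the toy sentence pairs are all assumptions made for the example, and the Chinese text is assumed to be word-segmented already.

```python
from collections import Counter


def candidate_translations(pairs, term, top_n=3):
    """Rank target-language tokens as candidate translations of `term`.

    pairs: list of (source_tokens, target_tokens) for aligned sentences.
    Scoring uses the Dice coefficient, an assumption made for illustration.
    """
    # Parallel concordance: target sides of pairs whose source side has the term.
    concordance = [set(tgt) for src, tgt in pairs if term in src]
    if not concordance:
        return []

    # Sentence-level frequencies inside the concordance and in the whole corpus.
    in_concordance = Counter(tok for tgt in concordance for tok in tgt)
    overall = Counter(tok for _, tgt in pairs for tok in set(tgt))

    n_term = len(concordance)  # number of sentences containing the term
    scores = {
        tok: 2 * count / (n_term + overall[tok])
        for tok, count in in_concordance.items()
    }
    # Highest score first; ties broken by token for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


# Hypothetical, hand-made sentence pairs (not taken from the corpus).
pairs = [
    (["the", "appellant", "appealed"], ["上訴人", "提出", "上訴"]),
    (["the", "appellant", "was", "convicted"], ["上訴人", "被", "定罪"]),
    (["the", "court", "dismissed", "the", "appeal"], ["法庭", "駁回", "上訴"]),
]

best = candidate_translations(pairs, "appellant")
print(best[0])
```

A token that appears in every concordance line and nowhere else receives the maximum score of 1.0, which is why a method of this kind can, in principle, work on a small corpus without a bilingual dictionary.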