This paper introduces an efficient analyser for the Chinese language that integrates word segmentation, part-of-speech tagging, partial parsing and full parsing efficiently and effectively. The analyser is based on a Hidden Markov Model (HMM) and an HMM-based tagger; that is, all the components are built on the same HMM-based tagging engine. One advantage of using a single engine is that it greatly reduces the code size and makes maintenance easy. Another advantage is that the code is easy to optimise for speed, which plays a critically important role in many applications. Finally, all the components benefit when the existing algorithms in the single engine are optimised or replaced with better ones. Experiments show that all the components achieve state-of-the-art performance with high efficiency for the Chinese language.
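To illustrate the idea of one tagging engine serving several components, the following is a minimal sketch, not the authors' implementation. It assumes a plain first-order HMM decoded with the Viterbi algorithm, which may differ from the model used in the paper. The function names (`viterbi`, `tags_to_words`), the B/I segmentation tag set, the part-of-speech labels and every probability are hypothetical toy values chosen only for the example. The same `viterbi` function is first applied to characters (word segmentation) and then to the resulting words (part-of-speech tagging).

```python
import math


def viterbi(obs, states, start_p, trans_p, emit_p, floor=1e-12):
    """Most probable state sequence for `obs` under a first-order HMM."""
    if not obs:
        return []

    def lp(p):
        # Log-probability with a small floor so unseen events do not give log(0).
        return math.log(max(p, floor))

    # best[t][s] = (log-score of the best path ending in state s at step t,
    #               previous state on that path)
    best = [{s: (lp(start_p.get(s, 0.0)) + lp(emit_p[s].get(obs[0], 0.0)), None)
             for s in states}]
    for t in range(1, len(obs)):
        column = {}
        for s in states:
            prev = max(states,
                       key=lambda p: best[t - 1][p][0] + lp(trans_p[p].get(s, 0.0)))
            score = (best[t - 1][prev][0]
                     + lp(trans_p[prev].get(s, 0.0))
                     + lp(emit_p[s].get(obs[t], 0.0)))
            column[s] = (score, prev)
        best.append(column)

    # Follow the back-pointers from the best final state.
    state = max(states, key=lambda s: best[-1][s][0])
    path = [state]
    for t in range(len(obs) - 1, 0, -1):
        state = best[t][state][1]
        path.append(state)
    return path[::-1]


def tags_to_words(chars, tags):
    """Group characters into words: 'B' begins a word, 'I' continues it."""
    words = []
    for ch, tag in zip(chars, tags):
        if tag == "B" or not words:
            words.append(ch)
        else:
            words[-1] += ch
    return words


# Component 1: word segmentation as character tagging (toy, made-up numbers).
chars = list("我爱北京")
seg_tags = viterbi(
    chars,
    states=["B", "I"],
    start_p={"B": 1.0},
    trans_p={"B": {"B": 0.6, "I": 0.4}, "I": {"B": 0.8, "I": 0.2}},
    emit_p={"B": {"我": 0.4, "爱": 0.3, "北": 0.3}, "I": {"京": 0.9, "北": 0.1}},
)
words = tags_to_words(chars, seg_tags)

# Component 2: part-of-speech tagging over the words, reusing the same engine.
# PN (pronoun), VV (verb) and NR (proper noun) are illustrative labels.
pos_tags = viterbi(
    words,
    states=["PN", "VV", "NR"],
    start_p={"PN": 0.6, "NR": 0.3, "VV": 0.1},
    trans_p={
        "PN": {"VV": 0.8, "NR": 0.1, "PN": 0.1},
        "VV": {"NR": 0.5, "PN": 0.4, "VV": 0.1},
        "NR": {"VV": 0.6, "NR": 0.2, "PN": 0.2},
    },
    emit_p={"PN": {"我": 1.0}, "VV": {"爱": 1.0}, "NR": {"北京": 1.0}},
)

print(words)
print(list(zip(words, pos_tags)))
```

With these toy parameters the characters are grouped into 我 / 爱 / 北京 and tagged PN, VV, NR. Only the state set and the probability tables change between the two tasks, so any speed-up to `viterbi` would carry over to every component that calls it.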