Arabic has a very complex morphological system, though a highly structured one. Character patterns are often indicative of word class and word segmentation. In this paper, we explore a novel approach to Arabic word segmentation and part-of-speech tagging that relies on character information. The approach is lexicon-free and does not require any morphological analysis, which eliminates dictionary coverage as a factor. Using character-based analysis, the developed system yielded state-of-the-art accuracy, comparing favourably with other taggers that rely on external resources.