Chinese word segmentation is an important and necessary step in analyzing Chinese text. In this paper, we focus on one of the primary challenges in Chinese word segmentation: low accuracy on out-of-vocabulary words. To address this problem, we group “similar” characters to generate a more abstract representation. Experimental results show that character abstraction yields a significant relative error reduction of 24.83% on average over the state-of-the-art baseline.
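To make the idea of character abstraction concrete, the following is a minimal sketch, not the paper's actual method. The abstract does not say how “similar” characters are identified, so the groups below (`DIGIT`, `SURNAME`) and the helper `abstract_chars` are hypothetical, hand-written placeholders. The sketch only illustrates why grouping may help with out-of-vocabulary words: two words that differ in one character can share the same abstract representation.

```python
# Hypothetical character groups, written by hand for illustration only.
# The paper's real grouping procedure is not described in the abstract.
CHAR_GROUPS = {
    "DIGIT": "零一二三四五六七八九十",
    "SURNAME": "王李张刘陈",
}

# Invert the table: each character maps to the label of its group.
CHAR_TO_GROUP = {
    ch: label for label, chars in CHAR_GROUPS.items() for ch in chars
}


def abstract_chars(text):
    """Replace each character with its group label; keep ungrouped characters."""
    return [CHAR_TO_GROUP.get(ch, ch) for ch in text]


# A word seen in training and a word assumed to be unseen
# end up with the same abstract representation.
seen_word = abstract_chars("三月")
unseen_word = abstract_chars("七月")
print(seen_word, unseen_word, seen_word == unseen_word)
```

In a real segmenter, such abstract labels would presumably serve as additional features alongside the raw characters rather than replace them; that design choice is also an assumption here.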