This paper describes our system in the Chinese spelling check (CSC) task of CLP-SIGHAN Bake-Off 2014. CSC is still an open problem today. To the best of our knowledge, n-gram language modeling (LM) is widely used in CSC because of its simplicity and fair predictive power. Our work in this paper continues this general line of research by using a tri-gram LM to detect and correct possible spelling errors. In addition, we use dynamic programming to improve the efficiency of the algorithm, and additive smoothing to solve the data sparseness problem in training set. Empirical evaluation results demonstrate the utility of our CSC system.
展开▼