In this paper, a novel method is proposed to address the insufficient representational power of character-level features or word-level features used alone. Because short texts are brief, sparse, and strongly context-dependent, our method takes word-level vectors and character-level vectors as inputs simultaneously and encodes sentence semantics with two Long Short-Term Memory (LSTM) networks or two bidirectional LSTM (BiLSTM) networks. The representation of the entire sentence combines the two outputs from the word-level and character-level encoders. Our experiments on Chinese short text classification show that word embeddings and character embeddings complement each other in representing sentence semantics, which helps to improve classification performance on Chinese short text.
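The dual-channel design described above can be sketched as follows. This is a minimal illustration in PyTorch, not the authors' implementation: the class name, layer sizes, vocabulary sizes, and the choice to concatenate the two sentence vectors are all assumptions made for the example.

```python
import torch
import torch.nn as nn

class DualChannelClassifier(nn.Module):
    """Hypothetical sketch: encode a sentence twice (word-level and
    character-level), each with a BiLSTM, then combine the two sentence
    vectors before classification. All sizes are illustrative."""

    def __init__(self, word_vocab=20000, char_vocab=5000,
                 emb_dim=128, hidden=64, num_classes=10):
        super().__init__()
        self.word_emb = nn.Embedding(word_vocab, emb_dim)
        self.char_emb = nn.Embedding(char_vocab, emb_dim)
        # One BiLSTM per channel; each yields a 2 * hidden sentence vector.
        self.word_lstm = nn.LSTM(emb_dim, hidden,
                                 batch_first=True, bidirectional=True)
        self.char_lstm = nn.LSTM(emb_dim, hidden,
                                 batch_first=True, bidirectional=True)
        self.fc = nn.Linear(4 * hidden, num_classes)

    def forward(self, word_ids, char_ids):
        # Final hidden states of each channel: shape (2, batch, hidden).
        _, (hw, _) = self.word_lstm(self.word_emb(word_ids))
        _, (hc, _) = self.char_lstm(self.char_emb(char_ids))
        # Join forward/backward directions: (batch, 2 * hidden) each.
        hw = torch.cat([hw[0], hw[1]], dim=-1)
        hc = torch.cat([hc[0], hc[1]], dim=-1)
        # Combine word-level and character-level sentence representations.
        return self.fc(torch.cat([hw, hc], dim=-1))

model = DualChannelClassifier()
words = torch.randint(0, 20000, (4, 12))  # batch of 4 sentences, 12 words
chars = torch.randint(0, 5000, (4, 30))   # same sentences, 30 characters
logits = model(words, chars)
print(logits.shape)  # torch.Size([4, 10])
```

Concatenation is only one way to combine the two channels; averaging or a gating layer would also fit the description in the abstract.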