The quality of word representations has an important impact on natural language processing tasks. Current Chinese word representation methods require huge training data sets, yield models whose quality depends heavily on the data set, and suffer from poor model stability. To address these problems, a word representation method based on the glyphs of Chinese characters, Glyph2Vec, is proposed. To take full advantage of the semantic information contained in Chinese characters, a glyph auto-encoder is constructed based on a convolutional auto-encoder. The glyph auto-encoder obtains Chinese character embeddings by mapping the glyph of each character into a latent low-dimensional semantic space. In Chinese named entity recognition experiments, Glyph2Vec improves the F1 score by 0.77%, 1.84%, and 1.31%, respectively, compared with Word2Vec. The experimental results show that the proposed method achieves better results than the existing method, which demonstrates its effectiveness.
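To make the glyph auto-encoder idea concrete, the following is a minimal sketch of a convolutional auto-encoder that compresses a glyph bitmap into a low-dimensional vector and reconstructs it. It is not the architecture used for Glyph2Vec: the class name `GlyphAutoEncoder`, the 32x32 input size, the 64-dimensional embedding, the layer widths, the MSE reconstruction loss, and the Adam optimizer are all assumptions chosen for illustration, and random tensors stand in for rendered character images.

```python
import torch
from torch import nn


class GlyphAutoEncoder(nn.Module):
    """Hypothetical convolutional auto-encoder for 1x32x32 glyph bitmaps."""

    def __init__(self, embed_dim: int = 64):
        super().__init__()
        # Encoder: two strided convolutions, then a linear map to the embedding.
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, stride=2, padding=1),   # 32x32 -> 16x16
            nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1),  # 16x16 -> 8x8
            nn.ReLU(),
            nn.Flatten(),
            nn.Linear(32 * 8 * 8, embed_dim),
        )
        # Decoder: mirror of the encoder, ending in pixel intensities in [0, 1].
        self.decoder = nn.Sequential(
            nn.Linear(embed_dim, 32 * 8 * 8),
            nn.ReLU(),
            nn.Unflatten(1, (32, 8, 8)),
            nn.ConvTranspose2d(32, 16, kernel_size=4, stride=2, padding=1),  # 8x8 -> 16x16
            nn.ReLU(),
            nn.ConvTranspose2d(16, 1, kernel_size=4, stride=2, padding=1),   # 16x16 -> 32x32
            nn.Sigmoid(),
        )

    def forward(self, glyphs: torch.Tensor):
        embedding = self.encoder(glyphs)
        reconstruction = self.decoder(embedding)
        return reconstruction, embedding


torch.manual_seed(0)
model = GlyphAutoEncoder(embed_dim=64)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

# Stand-in data: 8 random "glyph" images (assumption; real input would be
# character bitmaps rendered from a font).
glyph_batch = torch.rand(8, 1, 32, 32)

# A few reconstruction steps: the only training signal is the glyph itself.
for _ in range(3):
    reconstruction, _ = model(glyph_batch)
    loss = loss_fn(reconstruction, glyph_batch)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# After training, the encoder output serves as the character embedding.
with torch.no_grad():
    char_embeddings = model.encoder(glyph_batch)
```

In this sketch the embeddings are learned from glyph images alone, with no text corpus involved, which is the property the abstract contrasts with corpus-dependent methods such as Word2Vec. How the resulting vectors would be fed into a named entity recognition model is not specified here and is left out.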