This paper describes how a simple, novel Galois Power-of-Two (GPOW2) real-time embedding scheme improves the performance and accuracy of downstream NLP tasks. GPOW2 computes embeddings on the fly, in the context of the target NLP task, without the need for tabulated, precomputed embeddings. A notable feature of the method is its ability to capture multilevel embeddings in a single pass: it computes character, word, and sentence embeddings simultaneously. GPOW2 was derived during attempts to improve the performance of the SWAM Arabic morphological engine, a multipurpose tool that supports segmentation, classification, POS tagging, spell checking, word embeddings, and semantic search, among other tasks. SWAM is a pattern-oriented algorithm that relies on morphological patterns and POS tagging to perform NLP tasks. The paper demonstrates how GPOW2 improved the accuracy of POS tagging and pattern matching and, accordingly, the performance of the whole engine. The reported accuracy is 99.47% for pattern prediction and 98.80% for POS tagging.