This paper describes the Neural Machine Translation systems of Xiamen University for the Myanmar-English translation tasks of WAT 2018. In data pre-processing, we apply Unicode normalization, training-data filtering, different Myanmar tokenizers, and subword segmentation. We train NMT models with several architectures. The experimental results show that shallow RNN-based models can still outperform Transformer models in some settings, and that replacing the official Myanmar tokenizer with syllable segmentation improves translation quality.
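Syllable segmentation for Myanmar is commonly done with rules over Unicode code points rather than a learned model. The sketch below illustrates the general idea (it is not the authors' exact tool, and the character ranges and rules are simplified assumptions): a syllable break is inserted before each Myanmar consonant, except when that consonant is stacked with the previous one via the virama (U+1039) or devowelized by the asat (U+103A), in which case it closes the preceding syllable.

```python
import re

# Simplified rule-based Myanmar syllable segmenter (illustrative only).
# Break before a base consonant (U+1000-U+1021) unless:
#   - it is preceded by virama U+1039 (stacked consonant), or
#   - it is followed by asat U+103A or virama U+1039 (syllable-final).
_BREAK = re.compile(r"(?<!\u1039)([\u1000-\u1021])(?![\u1039\u103A])")

def syllable_segment(text: str) -> str:
    """Insert spaces at syllable boundaries in Myanmar text."""
    return _BREAK.sub(r" \1", text).strip()

# Example: the word for "Myanmar" splits into two syllables.
print(syllable_segment("\u1019\u103C\u1014\u103A\u1019\u102C"))
```

Compared with a word-level tokenizer, such syllable units yield a small, closed vocabulary with no out-of-vocabulary symbols, which is one plausible reason the abstract reports gains over the official tokenizer.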