Journal: ACM Transactions on Asian Language Information Processing

NeuMorph: Neural Morphological Tagging for Low-Resource Languages - An Experimental Study for Indic Languages


Abstract

This article deals with morphological tagging for low-resource languages. For this purpose, five Indic languages are taken as reference. In addition, two severely resource-poor languages, Coptic and Kurmanji, are also considered. The task entails predicting the morphological tags (case, degree, gender, etc.) of a word in context. We hypothesize that predicting the tag of a word does not always require a longer context such as the entire sentence. In this light, we study the usefulness of the convolution operation, which results in a convolutional neural network (CNN) based morphological tagger. Our proposed model (BLSTM-CNN) achieves insightful results in comparison to the present state of the art. Following the recent trend, the task is carried out under three different settings: single language, across languages, and across keys. Whereas previous models used only character-level features, we show that adding word vectors alongside character-level embeddings significantly improves the performance of all the models. Since obtaining high-quality word vectors for resource-poor languages remains a challenge, the proposed character-level BLSTM-CNN proves to be most effective in that scenario.
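To make the role of the convolution operation concrete, the following is a minimal sketch of a character-level convolution with max pooling, optionally concatenated with a word vector. It is an illustration under stated assumptions, not the authors' BLSTM-CNN: the BLSTM component is omitted, the weights are random rather than trained, and every name here (`char_embeddings`, `conv1d_max_pool`, `word_representation`, the Latin alphabet, the dimensions) is hypothetical.

```python
import random


def char_embeddings(alphabet, dim, seed=0):
    """Assign each character a random vector (a stand-in for trained embeddings)."""
    rng = random.Random(seed)
    return {ch: [rng.uniform(-1, 1) for _ in range(dim)] for ch in alphabet}


def conv1d_max_pool(vectors, filters, width):
    """Apply each filter to every window of `width` consecutive character
    vectors, pass the result through ReLU, and keep the maximum over positions."""
    dim = len(vectors[0])  # assumes a non-empty word
    # Zero-pad on the right so that short words still yield one full window.
    padding = [[0.0] * dim for _ in range(max(0, width - len(vectors)))]
    padded = vectors + padding
    pooled = []
    for weights in filters:  # each filter is a flat list of width * dim weights
        best = 0.0  # ReLU outputs are never below zero
        for start in range(len(padded) - width + 1):
            window = [x for vec in padded[start:start + width] for x in vec]
            activation = sum(w * x for w, x in zip(weights, window))
            best = max(best, activation)
        pooled.append(best)
    return pooled


def word_representation(word, char_emb, filters, width, word_vec=None):
    """Character-level features, with an optional word vector appended."""
    vectors = [char_emb[ch] for ch in word]
    features = conv1d_max_pool(vectors, filters, width)
    if word_vec is not None:
        features = features + list(word_vec)
    return features


# Hypothetical setup: 4-dimensional character vectors, 5 filters of width 3.
alphabet = "abcdefghijklmnopqrstuvwxyz"
char_emb = char_embeddings(alphabet, dim=4)
rng = random.Random(1)
filters = [[rng.uniform(-1, 1) for _ in range(3 * 4)] for _ in range(5)]

char_only = word_representation("ghar", char_emb, filters, width=3)
with_word_vec = word_representation("ghar", char_emb, filters, width=3,
                                    word_vec=[0.1, 0.2])
```

Each filter looks only at a short window of characters, which mirrors the hypothesis that local context can be enough for predicting a tag. Passing `word_vec=None` corresponds to the character-only setting the abstract recommends when good word vectors are unavailable.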
