Conference: International Conference on Asian Language Processing

Toward a Standardized and More Accurate Indonesian Part-of-Speech Tagging


Abstract

Previous work on Indonesian part-of-speech (POS) tagging is hard to compare because it was not evaluated on a common dataset. Furthermore, despite the success of neural network models for English POS tagging, such models have rarely been explored for Indonesian. In this paper, we explore various techniques for Indonesian POS tagging, including rule-based, CRF, and neural network-based models. We evaluate our models on the IDN Tagged Corpus. A recurrent neural network achieves a new state-of-the-art F1 score of 97.47. To provide a standard for future work, we publicly release the dataset split that we used.
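
To make the reported metric concrete, the following is a minimal sketch of how a token-level F1 score for a POS tagger might be computed. The abstract does not state which averaging scheme the authors used, so the macro average here is an assumption. The function names `per_tag_f1` and `macro_f1` and the toy tags are hypothetical and are not taken from the paper or the IDN Tagged Corpus.

```python
from collections import Counter


def per_tag_f1(gold, pred):
    """Return {tag: F1} for two equal-length sequences of POS tags."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1   # correct prediction for tag g
        else:
            fp[p] += 1   # tag p was predicted but is wrong
            fn[g] += 1   # tag g was missed
    scores = {}
    for tag in set(gold) | set(pred):
        denom = 2 * tp[tag] + fp[tag] + fn[tag]
        scores[tag] = 2 * tp[tag] / denom if denom else 0.0
    return scores


def macro_f1(gold, pred):
    """Unweighted mean of per-tag F1 (assumed averaging, not confirmed by the source)."""
    scores = per_tag_f1(gold, pred)
    return sum(scores.values()) / len(scores) if scores else 0.0


# Toy example with made-up tags, for illustration only.
gold = ["NN", "VB", "NN", "JJ"]
pred = ["NN", "VB", "JJ", "JJ"]
print(per_tag_f1(gold, pred))
print(macro_f1(gold, pred))
```

Scores computed this way are comparable across systems only when every system is evaluated on the same test sentences, which is the motivation for releasing a fixed dataset split.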
