International Conference on Asian Language Processing

Toward a Standardized and More Accurate Indonesian Part-of-Speech Tagging

Abstract

Previous work in Indonesian part-of-speech (POS) tagging is hard to compare because it has not been evaluated on a common dataset. Furthermore, despite the success of neural network models for English POS tagging, they are rarely explored for Indonesian. In this paper, we explore various techniques for Indonesian POS tagging, including rule-based, CRF, and neural network-based models. We evaluate our models on the IDN Tagged Corpus. A recurrent neural network achieves a new state-of-the-art F1 score of 97.47. To provide a standard for future work, we publicly release the dataset split that we used.