Toward a Standardized and More Accurate Indonesian Part-of-Speech Tagging

Abstract

Previous work in Indonesian part-of-speech (POS) tagging is hard to compare because it was not evaluated on a common dataset. Furthermore, despite the success of neural network models for English POS tagging, they are rarely explored for Indonesian. In this paper, we explore various techniques for Indonesian POS tagging, including rule-based, CRF, and neural network-based models. We evaluate our models on the IDN Tagged Corpus. A recurrent neural network achieves a new state-of-the-art F1 score of 97.47. To provide a standard for future work, we publicly release the dataset split that we used.

Bibliographic details

Source: International Conference on Asian Language Processing, 2018, p. 383, 5 pages in total
Authors: Kemal Kurniawan; Alham Fikri Aji
Original format: PDF
Chinese Library Classification: TP312-53
Keywords: Tagging; Encoding; Artificial neural networks; Training; Natural language processing; Hidden Markov models

Similar literature

1. Towards Accurate and Efficient Chinese Part-of-Speech Tagging [J]. Weiwei Su, Xiaojun Wa. Computational linguistics. 2016, Issue 3
2. Domain Adaptation for Part-of-Speech Tagging of Indonesian Text Using Affix Information [J]. Aditya Maulana, Ade Romadhony. Procedia Computer Science. 2021, Issue 1
3. Tagging Accuracy Analysis on Part-of-Speech Taggers [J]. Semih Yumusak, Erdogan Dogdu, Halife Kodaz. Journal of Computer and Communications. 2014, Issue 4
4. Toward a Standardized and More Accurate Indonesian Part-of-Speech Tagging [C]. Kemal Kurniawan, Alham Fikri Aji. International conference on Asian language processing. 2018
5. IITagger: Tagging Wall Street Journal text with part-of-speech information [D]. Kim, Yeongkwun. 1996
6. A fine-grained Chinese word segmentation and part-of-speech tagging corpus for clinical text [O]. Ying Xiong, Zhongmin Wang, Dehuan Jiang, 2019
7. Toward a Standardized and More Accurate Indonesian Part-of-Speech Tagging [O]. Kemal Kurniawan, Alham Fikri Aji. 2018
