Toward a Standardized and More Accurate Indonesian Part-of-Speech Tagging

Abstract

Previous work in Indonesian part-of-speech (POS) tagging is hard to compare because it was not evaluated on a common dataset. Furthermore, despite the success of neural network models for English POS tagging, they have rarely been explored for Indonesian. In this paper, we explore various techniques for Indonesian POS tagging, including rule-based, CRF, and neural network-based models. We evaluate our models on the IDN Tagged Corpus. A recurrent neural network achieves a new state-of-the-art F1 score of 97.47. To provide a standard for future work, we publicly release the dataset split that we used.
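
As an illustration of the kind of metric reported above, the following is a minimal sketch of a token-level F1 computation for POS tags. The abstract does not say how the F1 score is averaged, so the macro average used here is an assumption, and the function names and the tags in the example are hypothetical placeholders rather than the authors' evaluation code or tagset.

```python
from collections import Counter


def per_tag_f1(gold, pred):
    """Return {tag: F1} for two equal-length sequences of POS tags."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1   # correct prediction for tag g
        else:
            fp[p] += 1   # p was predicted but is wrong
            fn[g] += 1   # g was missed
    scores = {}
    for tag in set(gold) | set(pred):
        denom = 2 * tp[tag] + fp[tag] + fn[tag]
        scores[tag] = 2 * tp[tag] / denom if denom else 0.0
    return scores


def macro_f1(gold, pred):
    """Unweighted mean of per-tag F1 (assumed averaging scheme)."""
    scores = per_tag_f1(gold, pred)
    return sum(scores.values()) / len(scores) if scores else 0.0


# Hypothetical four-token example, not taken from the IDN Tagged Corpus.
gold = ["NN", "VB", "NN", "JJ"]
pred = ["NN", "VB", "JJ", "JJ"]
print(round(macro_f1(gold, pred), 4))
```

Scoring every model on the same held-out split with the same function is what makes the resulting numbers comparable across papers.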

Bibliographic details

Source: International Conference on Asian Language Processing, 2018, pp. 303-307 (5 pages)
Authors: Kemal Kurniawan; Alham Fikri Aji
Original format: PDF
Keywords: Tagging; Encoding; Artificial neural networks; Training; Natural language processing; Hidden Markov models

