Conference: Pacific Asia Conference on Language, Information and Computation

Using Stanford Part-of-Speech Tagger for the Morphologically-rich Filipino Language



Abstract

This research focuses on the implementation of a Maximum Entropy-based Part-of-Speech (POS) tagger for Filipino. It uses the Stanford POS tagger, a trainable tagger that has been trained on English, Chinese, Arabic, and other languages and produces among the highest results in each. The tagger was trained for Filipino on a 406k-token corpus, taking into account linguistic phenomena characteristic of Filipino, such as rich morphology and intra-sentential code-switching. The resulting Filipino POS tagger achieved 96.15% tagging accuracy, which is currently the highest among existing POS taggers for Filipino, by a large margin.
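To make the Maximum Entropy idea concrete, the following is a minimal sketch of how such a tagger scores candidate tags for one token: each (feature, tag) pair has a weight, and the tag probability is the normalized exponential of the summed weights. Everything here is an assumption for illustration and is not taken from the paper or from the Stanford POS tagger: the feature templates (word, prefixes, suffix, previous word), the two-tag set, the example sentence, and the hand-set weights. A real tagger would learn the weights from the training corpus.

```python
import math

# Hypothetical two-tag set; the paper's actual tagset is not given in the abstract.
TAGS = ["NOUN", "VERB"]

def features(words, i, tag):
    """Illustrative feature templates, each paired with a candidate tag."""
    w = words[i].lower()
    prev = words[i - 1].lower() if i > 0 else "<s>"
    return [
        f"word={w}|tag={tag}",
        f"prefix2={w[:2]}|tag={tag}",   # affix features help with rich morphology
        f"prefix3={w[:3]}|tag={tag}",
        f"suffix2={w[-2:]}|tag={tag}",
        f"prev={prev}|tag={tag}",
    ]

def tag_distribution(words, i, tags, weights):
    """Return p(tag | context) = exp(sum of active weights) / Z for each tag."""
    scores = {
        t: sum(weights.get(f, 0.0) for f in features(words, i, t))
        for t in tags
    }
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {t: math.exp(s - top) for t, s in scores.items()}
    z = sum(exps.values())
    return {t: e / z for t, e in exps.items()}

# Hand-set weights for illustration only; they are not learned from data.
WEIGHTS = {
    "prefix3=nag|tag=VERB": 2.0,
    "word=adobo|tag=NOUN": 1.5,
}

words = ["Nagluto", "siya", "ng", "adobo"]
dist = tag_distribution(words, 0, TAGS, WEIGHTS)
best = max(dist, key=dist.get)
print(best, dist)
```

In this toy setup the prefix feature is the only one with a nonzero weight for the first token, so it alone pushes the distribution toward the verb tag. Unweighted features contribute nothing, which is why a sparse weight dictionary is enough for the sketch.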
