Conference: Pacific Asia Conference on Language, Information and Computation
Using Stanford Part-of-Speech Tagger for the Morphologically-rich Filipino Language

Abstract

This research focuses on the implementation of a Maximum Entropy-based Part-of-Speech (POS) tagger for Filipino. It uses the Stanford POS tagger, a trainable POS tagger that has been trained on English, Chinese, Arabic, and other languages and that produces among the highest results reported for each language. The tagger was trained for Filipino on a 406k-token corpus, taking into account distinctive Filipino linguistic phenomena such as rich morphology and intra-sentential code-switching. The resulting Filipino POS tagger achieved 96.15% tagging accuracy, which is currently the highest accuracy among existing POS taggers for Filipino, by a large margin.
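
To make the reported figure concrete, the sketch below shows one common way a token-level tagging accuracy such as 96.15% is computed: the fraction of tokens whose predicted tag matches the gold-standard tag. The abstract does not describe the evaluation procedure, so this definition is an assumption, and the function name `tagging_accuracy`, the tag labels, and the toy code-switched sentence are hypothetical and not taken from the paper.

```python
from typing import List, Tuple

# A tagged sentence is a list of (token, tag) pairs.
TaggedSentence = List[Tuple[str, str]]


def tagging_accuracy(gold: List[TaggedSentence],
                     predicted: List[TaggedSentence]) -> float:
    """Return the fraction of tokens whose predicted tag equals the gold tag."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must contain the same number of sentences")

    correct = 0
    total = 0
    for gold_sent, pred_sent in zip(gold, predicted):
        if len(gold_sent) != len(pred_sent):
            raise ValueError("aligned sentences must have the same number of tokens")
        for (_, gold_tag), (_, pred_tag) in zip(gold_sent, pred_sent):
            correct += int(gold_tag == pred_tag)
            total += 1

    if total == 0:
        raise ValueError("no tokens to score")
    return correct / total


# Toy example with hypothetical tags (not the paper's tagset).
# The last token is an English code-switch that the tagger mislabels.
gold = [[("Nag-aral", "VERB"), ("ako", "PRON"), ("ng", "PART"), ("programming", "FW")]]
pred = [[("Nag-aral", "VERB"), ("ako", "PRON"), ("ng", "PART"), ("programming", "NOUN")]]

print(tagging_accuracy(gold, pred))  # 0.75 (3 of 4 tokens correct)
```

Scoring per token rather than per sentence means a single error on a code-switched word lowers the score only slightly, which is why token accuracy is usually reported alongside the corpus size.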