Conference: Workshop on NLP for Similar Languages, Varieties and Dialects

Whit's the Richt Pairt o Speech: PoS tagging for Scots

Abstract

In this paper we explore PoS tagging for the Scots language. Scots is spoken in Scotland and Northern Ireland, and is closely related to English. As no linguistically annotated Scots data were available, we manually PoS tagged a small set that is used for evaluation and training. We use English as a transfer language to examine zero-shot transfer and transfer learning methods. We find that training on a very small amount of Scots data was superior to zero-shot transfer from English. Combining the Scots and English data led to further improvements, with a concatenation method giving the best results. We also compared the use of two different English treebanks and found that a treebank containing web data was superior in the zero-shot setting, while it was outperformed by a treebank containing a mix of genres when combined with Scots data.
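
The three training setups compared in the abstract (zero-shot transfer from English, training on Scots only, and concatenating the English and Scots data) can be sketched as follows. This is a minimal illustration, not the paper's method: it assumes that "concatenation" means pooling the two training sets, it swaps in a simple most-frequent-tag baseline in place of whatever tagger the authors used, and the sentences and tags are invented toy data rather than the paper's annotated set. All names (`train`, `accuracy`, `run_setups`) are hypothetical.

```python
from collections import Counter, defaultdict

# Invented toy data (an assumption for illustration, NOT the paper's data).
# Tags use UD-style labels.
ENGLISH_TRAIN = [
    [("the", "DET"), ("house", "NOUN"), ("is", "AUX"), ("small", "ADJ")],
    [("I", "PRON"), ("know", "VERB"), ("the", "DET"), ("man", "NOUN")],
]
SCOTS_TRAIN = [
    [("the", "DET"), ("hoose", "NOUN"), ("is", "AUX"), ("wee", "ADJ")],
    [("I", "PRON"), ("ken", "VERB"), ("the", "DET"), ("man", "NOUN")],
]
SCOTS_TEST = [
    [("I", "PRON"), ("ken", "VERB"), ("the", "DET"),
     ("wee", "ADJ"), ("hoose", "NOUN")],
]


def train(sentences):
    """Most-frequent-tag baseline: map each lowercased word to its commonest tag."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        for word, tag in sentence:
            counts[word.lower()][tag] += 1
    return {word: tags.most_common(1)[0][0] for word, tags in counts.items()}


def accuracy(model, sentences, default_tag="NOUN"):
    """Token-level accuracy; unseen words fall back to default_tag."""
    total = correct = 0
    for sentence in sentences:
        for word, gold in sentence:
            total += 1
            correct += model.get(word.lower(), default_tag) == gold
    return correct / total


def run_setups():
    """Evaluate all three setups on the same Scots test sentences."""
    return {
        # Zero-shot: train on English only, test on Scots.
        "zero_shot": accuracy(train(ENGLISH_TRAIN), SCOTS_TEST),
        # Scots only: train on the small Scots set.
        "scots_only": accuracy(train(SCOTS_TRAIN), SCOTS_TEST),
        # Concatenation: pool English and Scots training data.
        "concatenation": accuracy(train(ENGLISH_TRAIN + SCOTS_TRAIN), SCOTS_TEST),
    }


scores = run_setups()
for name, score in scores.items():
    print(f"{name}: {score:.2f}")
```

The scores this toy produces say nothing about the paper's results; the sketch only shows how the three setups differ in which data the tagger sees at training time.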
