Conference: Workshop on Noisy User-generated Text

Minority Language Twitter: Part-of-Speech Tagging and Analysis of Irish Tweets

Abstract

Noisy user-generated text poses problems for natural language processing. In this paper, we show that this also holds true for the Irish language. Irish is regarded as a low-resourced language, with limited annotated corpora available to NLP researchers and linguists who wish to fully analyse the linguistic patterns of language use in social media. We contribute to recent advances in this area of research by reporting on the development of a part-of-speech annotation scheme and an annotated corpus for Irish-language tweets. We also report state-of-the-art tagging results from training and testing three existing POS taggers on our new dataset.