International Conference on Text, Speech and Dialogue

Improving Part-of-Speech Tagging by Meta-learning

Abstract

Recently, we have observed rapid progress in the state of part-of-speech (POS) tagging for Polish. Thanks to PolEval, a shared task organized in late 2017, many new approaches to this problem have been proposed. New deep learning paradigms have helped to narrow the gap between the accuracy of POS tagging methods for Polish and for English. Still, the number of errors made by the taggers on large corpora is very high: even the currently best-performing tagger reaches an accuracy of ca. 94.5%, which translates to millions of errors in a billion-word corpus. To further improve the accuracy of Polish POS tagging, we propose to employ a meta-learning approach on top of several existing taggers. This approach is inspired by the fact that the taggers, while often similar in terms of accuracy, make different errors, which leads to the conclusion that some of the methods are better than others in specific contexts. We thus train a machine learning method that captures the relationship between a particular tagger's accuracy and the language context, and in this way create a model that selects among several taggers in each context to maximize the expected tagging accuracy.
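The selection idea in the abstract can be illustrated with a minimal sketch. The abstract does not say which machine learning method or which context features the authors use, so everything below is an assumption made for illustration: the context is reduced to a two-letter word suffix, the learner is a plain per-context hit count rather than a trained classifier, and the names `context_key`, `train_selector` and `select_tag`, as well as the toy words and tags, are hypothetical.

```python
from collections import defaultdict


def context_key(word):
    # Hypothetical context feature: the lowercased two-letter suffix of the word.
    # A real system would use far richer context (neighbouring words, tags, etc.).
    return word[-2:].lower()


def train_selector(examples, n_taggers):
    """Count, per context, how often each base tagger proposed the gold tag.

    examples: iterable of (word, proposals, gold_tag), where proposals[i]
    is the tag proposed by base tagger i for that word.
    """
    hits = defaultdict(lambda: [0] * n_taggers)
    global_hits = [0] * n_taggers
    for word, proposals, gold in examples:
        key = context_key(word)
        for i, tag in enumerate(proposals):
            if tag == gold:
                hits[key][i] += 1
                global_hits[i] += 1
    return {"hits": dict(hits), "global_hits": global_hits}


def select_tag(model, word, proposals):
    """Return the proposal of the tagger with the most hits in this context.

    Falls back to overall hit counts when the context was never seen.
    """
    scores = model["hits"].get(context_key(word), model["global_hits"])
    best = max(range(len(proposals)), key=lambda i: scores[i])
    return proposals[best]


# Toy, made-up training data: tagger 0 is right on "-ie" words,
# tagger 1 is right on "-em" words.
train = [
    ("kocie", ["adj", "subst"], "adj"),
    ("psie", ["adj", "subst"], "adj"),
    ("domem", ["adj", "subst"], "subst"),
    ("kotem", ["fin", "subst"], "subst"),
]
model = train_selector(train, n_taggers=2)
chosen_ie = select_tag(model, "lisie", ["adj", "subst"])
chosen_em = select_tag(model, "lasem", ["adj", "subst"])
```

On this toy data the selector follows tagger 0 for words ending in "-ie" and tagger 1 for words ending in "-em", which is the behaviour the abstract describes: choosing, per context, the tagger expected to be most accurate.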
