Improving Arabic Diacritization with Regularized Decoding and Adversarial Training

Abstract

Arabic diacritization is a fundamental task for Arabic language processing. Previous studies have demonstrated that automatically generated knowledge can be helpful to this task. However, these studies regard the auto-generated knowledge instances as gold references, which limits their effectiveness since such knowledge is not always accurate and inferior instances can lead to incorrect predictions. In this paper, we propose to use regularized decoding and adversarial training to appropriately learn from such noisy knowledge for diacritization. Experimental results on two benchmark datasets show that, even with quite flawed auto-generated knowledge, our model can still learn adequate diacritics and outperform all previous studies on both datasets.
