Conference: Workshop on deep learning approaches for low-resource natural language processing 2018

Training a Neural Network in a Low-Resource Setting on Automatically Annotated Noisy Data



Abstract

Manually labeled corpora are expensive to create and often not available for low-resource languages or domains. Automatic labeling is a quicker and cheaper alternative for obtaining labeled data. However, automatically assigned labels often contain more errors, which can degrade the performance of a classifier trained on them. We propose a noise layer that is added to a neural network architecture. This allows us to model the noise and to train on a combination of clean and noisy data. We show that in a low-resource NER task we can improve performance by up to 35% by using additional, noisy data and handling the noise.
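
The abstract does not specify how the noise layer is built, so the following is only a minimal sketch under a common assumption: the layer is a learnable label-transition (confusion) matrix applied to the base model's softmax output, so that noisy examples are scored against a "noisified" prediction while clean examples use the base prediction directly. The function names `apply_noise_layer` and `init_noise_logits`, and the `smoothing` parameter, are hypothetical and chosen for illustration.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax with the usual max-shift for numerical stability."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def init_noise_logits(clean_labels, noisy_labels, num_classes, smoothing=1.0):
    """Assumed initialisation: count how often clean label i co-occurs with
    noisy label j on data that carries both, then take the log so that a
    row-wise softmax recovers the normalised counts."""
    counts = np.full((num_classes, num_classes), smoothing, dtype=float)
    for clean, noisy in zip(clean_labels, noisy_labels):
        counts[clean, noisy] += 1.0
    return np.log(counts)

def apply_noise_layer(clean_probs, noise_logits):
    """Map predicted clean-label distributions to noisy-label distributions.

    Row i of the transition matrix is an assumed p(noisy label | clean label i);
    in training, noise_logits would be a learnable parameter.
    """
    transition = softmax(noise_logits)
    return clean_probs @ transition

# Hypothetical 3-class example with two base-model predictions.
base_probs = softmax(np.array([[2.0, 0.5, -1.0], [0.0, 0.0, 0.0]]))
noise_logits = init_noise_logits([0, 1, 2, 0], [0, 1, 2, 1], num_classes=3)
noisy_probs = apply_noise_layer(base_probs, noise_logits)
```

Under this assumption, the loss on noisy examples would be computed from `noisy_probs` and the loss on clean examples from `base_probs`, so the base model is not forced to reproduce systematic labeling errors.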
