Conference on Empirical Methods in Natural Language Processing

Evaluating the Utility of Hand-crafted Features in Sequence Labelling

Abstract

Conventional wisdom is that hand-crafted features are redundant for deep learning models, as these models already learn adequate representations of text automatically from corpora. In this work, we test this claim by proposing a new method for exploiting hand-crafted features as part of a novel hybrid learning approach, incorporating a feature auto-encoder loss component. We evaluate on the task of named entity recognition (NER), where we show that including manual features for part-of-speech, word shapes and gazetteers can improve the performance of a neural CRF model. We obtain an F_1 of 91.89 for the CoNLL-2003 English shared task, which significantly outperforms a collection of highly competitive baseline models. We also present an ablation study showing the importance of auto-encoding, over using features as either inputs or outputs alone, and moreover, show that including the auto-encoder components reduces training requirements to 60%, while retaining the same predictive accuracy.
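The abstract names word shapes and gazetteers as hand-crafted features and mentions a feature auto-encoder loss component, but gives no formulas. The following is a minimal sketch under stated assumptions, not the authors' implementation: the shape alphabet (`X`, `x`, `d`), the helper names `word_shape`, `in_gazetteer` and `hybrid_loss`, and the weighted-sum form of the combined loss are all hypothetical choices made for illustration.

```python
import math


def word_shape(token):
    """Map a token to a coarse shape (assumed scheme: X upper, x lower, d digit)."""
    out = []
    for ch in token:
        if ch.isupper():
            out.append("X")
        elif ch.islower():
            out.append("x")
        elif ch.isdigit():
            out.append("d")
        else:
            out.append(ch)  # keep punctuation such as '-' unchanged
    return "".join(out)


def in_gazetteer(token, gazetteer):
    """Binary gazetteer feature: case-insensitive membership in a set of lowercased names."""
    return token.lower() in gazetteer


def cross_entropy(probs, gold_index):
    """Negative log-probability assigned to the gold class."""
    return -math.log(probs[gold_index])


def hybrid_loss(tag_loss, feature_probs, gold_features, weights):
    """Assumed combined objective: tagging loss plus a weighted sum of
    per-feature reconstruction losses (the auto-encoding part)."""
    reconstruction = sum(
        weights[name] * cross_entropy(feature_probs[name], gold_features[name])
        for name in gold_features
    )
    return tag_loss + reconstruction


if __name__ == "__main__":
    print(word_shape("CoNLL-2003"))
    print(in_gazetteer("Melbourne", {"melbourne", "london"}))
    print(hybrid_loss(1.0, {"pos": [0.5, 0.5]}, {"pos": 0}, {"pos": 2.0}))
```

In this sketch the same feature values would serve both as model inputs and as reconstruction targets, which is one way to read "auto-encoding" in the abstract. The per-feature weights are an assumed hyperparameter, not something the abstract specifies.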
