Published in: Annual Meeting of the Association for Computational Linguistics

Cross-Domain Generalization of Neural Constituency Parsers




Neural parsers obtain state-of-the-art results on benchmark treebanks for constituency parsing, but to what degree do they generalize to other domains? We present three results about the generalization of neural parsers in a zero-shot setting: training on trees from one corpus and evaluating on out-of-domain corpora. First, neural and non-neural parsers generalize comparably to new domains. Second, incorporating pre-trained encoder representations into neural parsers substantially improves their performance across all domains, but does not give a larger relative improvement for out-of-domain treebanks. Finally, despite the rich input representations they learn, neural parsers still benefit from structured prediction of output trees, yielding higher exact match accuracy and stronger generalization both to larger text spans and to out-of-domain corpora. We analyze generalization on English and Chinese corpora, and in the process obtain state-of-the-art parsing results for the Brown, Genia, and English Web treebanks.
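The abstract reports exact match accuracy alongside the usual bracket-based parsing scores. As a rough illustration of how the two differ, the following is a minimal sketch that compares a gold tree with a predicted tree. The tree encoding (nested `(label, children)` tuples with words as plain strings), the function names, and the example sentence are all assumptions made for this illustration; they are not taken from the paper, and the scoring is simplified relative to standard evaluation tools (for example, there is no special handling of punctuation or part-of-speech tags).

```python
from collections import Counter

def labeled_spans(tree, start=0, spans=None):
    """Collect (label, start, end) for every constituent; return (spans, end)."""
    if spans is None:
        spans = Counter()
    label, children = tree
    end = start
    for child in children:
        if isinstance(child, str):
            end += 1  # a word token covers exactly one position
        else:
            _, end = labeled_spans(child, end, spans)
    spans[(label, start, end)] += 1
    return spans, end

def bracket_f1(gold, pred):
    """Labeled bracket F1 between two trees over the same sentence."""
    g, _ = labeled_spans(gold)
    p, _ = labeled_spans(pred)
    matched = sum((g & p).values())  # multiset intersection of brackets
    if matched == 0:
        return 0.0
    precision = matched / sum(p.values())
    recall = matched / sum(g.values())
    return 2 * precision * recall / (precision + recall)

def exact_match(gold, pred):
    """True only if both trees contain exactly the same labeled brackets."""
    return labeled_spans(gold)[0] == labeled_spans(pred)[0]

# Hypothetical example: the prediction attaches the PP at the wrong level.
gold = ("S", [("NP", ["the", "cat"]),
              ("VP", ["sat", ("PP", ["on", ("NP", ["the", "mat"])])])])
pred = ("S", [("NP", ["the", "cat"]),
              ("VP", ["sat"]),
              ("PP", ["on", ("NP", ["the", "mat"])])])

print(bracket_f1(gold, pred), exact_match(gold, pred))
```

In this example four of the five brackets agree, so the bracket F1 is 0.8, while exact match is false because a single attachment error breaks the whole tree. This is why exact match is the stricter of the two measures.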




