Conference on Empirical Methods in Natural Language Processing

Reporting Score Distributions Makes a Difference: Performance Study of LSTM-networks for Sequence Tagging

Abstract

In this paper, we show that reporting a single performance score is insufficient to compare non-deterministic approaches. We demonstrate for common sequence tagging tasks that the seed value for the random number generator can result in statistically significant ($p < 10^{-4}$) differences for state-of-the-art systems. For two recent systems for NER, we observe an absolute difference of one percentage point in $F_1$-score depending on the selected seed value, so that these systems are perceived as either state-of-the-art or mediocre. Instead of publishing and reporting single performance scores, we propose to compare score distributions based on multiple executions. Based on the evaluation of 50,000 LSTM-networks for five sequence tagging tasks, we present network architectures that both achieve superior performance and are more stable with respect to the remaining hyperparameters. The full experimental results are published in (Reimers and Gurevych, 2017). The implementation of our network is publicly available.
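
To make the proposal of comparing score distributions concrete, here is a minimal Python sketch. It assumes each system has been trained several times with different seeds and that one score per run is available. The function names `summarize` and `permutation_test`, the choice of a permutation test on the difference in means, and the example scores are all illustrative assumptions. They are not the authors' procedure or data.

```python
import random
import statistics


def summarize(scores):
    """Return mean, sample standard deviation, min and max of a list of scores."""
    return {
        "mean": statistics.mean(scores),
        "stdev": statistics.stdev(scores),
        "min": min(scores),
        "max": max(scores),
    }


def permutation_test(scores_a, scores_b, n_resamples=10000, seed=0):
    """Two-sided permutation test on the difference in mean score.

    Returns an estimated p-value: the share of random relabelings whose
    absolute mean difference is at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = abs(statistics.mean(scores_a) - statistics.mean(scores_b))
    pooled = list(scores_a) + list(scores_b)
    n_a = len(scores_a)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        diff = abs(statistics.mean(pooled[:n_a]) - statistics.mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # Add-one smoothing keeps the estimate strictly above zero.
    return (hits + 1) / (n_resamples + 1)


# Hypothetical F1 scores from eight seeded runs per system (placeholder values,
# not results from the paper).
system_a = [90.1, 90.6, 90.9, 91.0, 90.4, 90.7, 91.2, 90.3]
system_b = [90.5, 90.8, 90.2, 91.1, 90.6, 90.9, 90.4, 91.0]

print("A:", summarize(system_a))
print("B:", summarize(system_b))
print("p-value:", permutation_test(system_a, system_b, n_resamples=2000))
```

Reporting the summary statistics together with a test over all runs, rather than the best single run, is what keeps a lucky seed from deciding the comparison.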
