Conference on Empirical Methods in Natural Language Processing

Portable, layer-wise task performance monitoring for NLP models




There is a long-standing interest in understanding the internal behavior of neural networks (Touretzky and Pomerleau, 1989; Zhou et al., 2017; Raghu et al., 2017; Alishahi et al., 2017). Deep neural architectures for natural language processing (NLP) are often accompanied by explanations for their effectiveness, from general observations (e.g. RNNs can represent unbounded dependencies in a sequence) to specific arguments about linguistic phenomena (early layers encode lexical information, deeper layers encode syntactic information). The recent ascendancy of DNNs is fueling efforts in the NLP community to explore these claims (Belinkov et al., 2017; Dalvi et al., 2017; Karpathy et al., 2015; Kadar et al., 2016; Kohn, 2015; Qian et al., 2016a). Previous work has tended to focus on easily accessible representations such as word or sentence embeddings (Kohn, 2015; Qian et al., 2016b; Adi et al., 2016), while deeper structure has required more ad hoc methods to extract and examine (Belinkov and Glass, 2017; Poliak et al., 2018). In this work, we introduce Vivisect, a toolkit that aims at a general solution for broad and fine-grained monitoring in the major DNN frameworks, with minimal change to research patterns. Vivisect is general enough to serve as a less-polished version of the widely used TensorBoard tool, but has several priorities that set it apart:

  • Minimal invasiveness (e.g. no SummaryOps)
  • Low resource use (only keep final metrics)
  • Uniform support for major DNN frameworks
  • Monitor performance on auxiliary tasks
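The first two priorities, minimal invasiveness and keeping only final metrics, can be illustrated with a small framework-free sketch. This is not Vivisect's API, which the abstract does not describe. The names `RunningMean`, `attach_monitor`, and `l2_norm` are hypothetical, the "model" is a dictionary of plain Python functions, and the activation norm is only a stand-in for an auxiliary-task score.

```python
import math
from typing import Callable, Dict, List, Tuple

Vector = List[float]
Layer = Callable[[Vector], Vector]


class RunningMean:
    """Keeps a count and a mean only, never the observed values."""

    def __init__(self) -> None:
        self.count = 0
        self.mean = 0.0

    def update(self, value: float) -> None:
        self.count += 1
        self.mean += (value - self.mean) / self.count


def attach_monitor(
    layers: Dict[str, Layer], metric: Callable[[Vector], float]
) -> Tuple[Dict[str, Layer], Dict[str, RunningMean]]:
    """Wrap every layer so its output is scored; layer code is not modified."""
    stats = {name: RunningMean() for name in layers}

    def wrap(name: str, layer: Layer) -> Layer:
        def monitored(x: Vector) -> Vector:
            out = layer(x)
            # Record the metric and discard the activation itself.
            stats[name].update(metric(out))
            return out

        return monitored

    wrapped = {name: wrap(name, layer) for name, layer in layers.items()}
    return wrapped, stats


def l2_norm(v: Vector) -> float:
    """Stand-in metric (hypothetical); a real probe would score an auxiliary task."""
    return math.sqrt(sum(x * x for x in v))


# Toy two-layer "model", written with no knowledge of the monitor.
layers: Dict[str, Layer] = {
    "double": lambda v: [2.0 * x for x in v],
    "relu": lambda v: [max(0.0, x) for x in v],
}
monitored, stats = attach_monitor(layers, l2_norm)


def forward(x: Vector) -> Vector:
    for layer in monitored.values():
        x = layer(x)
    return x


for example in ([3.0, -4.0], [0.0, 1.0]):
    forward(example)

for name, stat in stats.items():
    print(name, stat.count, stat.mean)
```

The layers are defined without any monitoring calls, and memory use stays constant per layer because only a count and a mean are retained. In a real DNN framework the wrapping step would more likely use the framework's own hook mechanism.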


