Character Eyes: Seeing Language through Character-Level Taggers

Abstract

Character-level models have been used extensively in recent years in NLP tasks as both supplements and replacements for closed-vocabulary token-level word representations. In one popular architecture, character-level LSTMs are used to feed token representations into a sequence tagger predicting token-level annotations such as part-of-speech (POS) tags. In this work, we examine the behavior of POS taggers across languages from the perspective of individual hidden units within the character LSTM. We aggregate the behavior of these units into language-level metrics which quantify the challenges that taggers face on languages with different morphological properties, and identify links between synthesis and affixation preference and emergent behavior of the hidden tagger layer. In a comparative experiment, we show how modifying the balance between forward and backward hidden units affects model arrangement and performance in these types of languages.
