Pacific Asia Conference on Language, Information and Computation

Augmented Parsing of Unknown Word by Graph-based Semi-supervised Learning


Abstract

This paper presents a novel method that uses graph-based semi-supervised learning (SSL) to improve the syntactic parsing of unknown words. Unlike conventional approaches that use hand-crafted rules, rich morphological features, or a character-based model to handle unknown words, this method is based on graph-based label propagation. The improvement is larger for grammars trained on a small amount of labeled data together with a large amount of unlabeled data. A transductive graph-based SSL method is employed to propagate part-of-speech (POS) information from labeled data to unlabeled data and to derive the emission distributions. The derived distributions are then incorporated into the parsing process. The proposed method effectively augments the original supervised parsing model, yielding absolute improvements of 2.28% in POS tagging accuracy and 1.72% in syntactic parsing accuracy on the Penn Chinese Treebank.
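
The abstract does not specify how the graph is built or which update rule is used, so the following is only a minimal sketch of generic iterative label propagation, not the authors' implementation. It assumes a hypothetical similarity graph over word types, seed POS distributions for words seen in the labeled data, and a simple weighted-average update with labeled nodes clamped; the function name `propagate_labels`, the toy graph, and the two-tag set are all illustrative assumptions.

```python
def propagate_labels(weights, seed, n_tags, iterations=50):
    """Generic label propagation over a weighted graph (illustrative sketch).

    weights: dict mapping node -> {neighbor: similarity weight}; every node
             that appears as a neighbor must also appear as a key.
    seed:    dict mapping labeled node -> list of tag probabilities.
    Returns a dict mapping every node -> distribution over n_tags tags.
    """
    uniform = [1.0 / n_tags] * n_tags
    # Labeled nodes start from their seed; unlabeled nodes start uniform.
    dist = {v: list(seed.get(v, uniform)) for v in weights}
    for _ in range(iterations):
        updated = {}
        for v, nbrs in weights.items():
            if v in seed:
                # Clamp labeled nodes to their seed distribution.
                updated[v] = list(seed[v])
                continue
            total = sum(nbrs.values())
            if total == 0:
                # Isolated node: keep its current distribution.
                updated[v] = dist[v]
                continue
            # Unlabeled node: weighted average of neighbor distributions.
            updated[v] = [
                sum(w * dist[u][t] for u, w in nbrs.items()) / total
                for t in range(n_tags)
            ]
        dist = updated
    return dist


# Hypothetical toy graph: "a" and "b" are known words, "x" and "y" are unknown.
# Tag order is assumed to be [noun, verb].
graph = {
    "a": {"x": 0.9},
    "b": {"x": 0.1},
    "x": {"a": 0.9, "b": 0.1, "y": 1.0},
    "y": {"x": 1.0},
}
seed = {"a": [1.0, 0.0], "b": [0.0, 1.0]}
result = propagate_labels(graph, seed, n_tags=2)
print(result["x"], result["y"])
```

In this toy setting the unknown word "x" is far more similar to "a" than to "b", so its propagated distribution leans toward the first tag, and "y" inherits that distribution through its only neighbor "x". In a parser, distributions of this kind could stand in for the emission probabilities of words absent from the treebank, which is the role the abstract assigns to the derived distributions.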
