International Conference on Recent Advances in Natural Language Processing

Improving Discourse Relation Projection to Build Discourse Annotated Corpora

Abstract

The naive approach to annotation projection is not effective for projecting discourse annotations from one language to another, because implicit discourse relations are often changed to explicit ones, and vice versa, in translation. In this paper, we propose a novel approach based on the intersection between statistical word-alignment models to identify unsupported discourse annotations. This approach identified 65% of the unsupported annotations in the English-French parallel sentences from Europarl. By filtering out these unsupported annotations, we induced the first PDTB-style discourse-annotated corpus for French from Europarl. We then used this corpus to train a classifier to identify the discourse usage of French discourse connectives and show a 15% improvement in F1 score compared to the classifier trained on the non-filtered annotations.
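As a rough illustration of the alignment-intersection idea, the sketch below intersects two directional word alignments and flags a source-side connective as unsupported when none of its tokens keeps a link in the intersection. The abstract does not spell out the exact criterion, so this rule, the function names (`intersect_alignments`, `is_supported`, `filter_annotations`), and the toy index pairs are all assumptions made for illustration, not the authors' implementation.

```python
from typing import Dict, List, Set, Tuple

# A directional alignment is a set of (index_in_first_lang, index_in_second_lang) links.
Alignment = Set[Tuple[int, int]]


def intersect_alignments(src_to_tgt: Alignment, tgt_to_src: Alignment) -> Alignment:
    """Keep only the links proposed by both directional models.

    tgt_to_src holds (target_index, source_index) pairs, so each pair is
    flipped to (source_index, target_index) before intersecting.
    """
    flipped = {(s, t) for (t, s) in tgt_to_src}
    return src_to_tgt & flipped


def is_supported(connective_span: List[int], intersection: Alignment) -> bool:
    """Assumed criterion: a connective is supported if at least one of its
    source tokens still has a link in the intersected alignment."""
    aligned_sources = {s for (s, _) in intersection}
    return any(i in aligned_sources for i in connective_span)


def filter_annotations(
    annotations: Dict[str, List[int]], intersection: Alignment
) -> Dict[str, List[int]]:
    """Drop annotations whose connective has no support on the target side."""
    return {
        name: span
        for name, span in annotations.items()
        if is_supported(span, intersection)
    }


# Toy data (hypothetical token indices, not taken from Europarl).
en_to_fr = {(0, 0), (1, 1), (2, 2), (3, 3)}
fr_to_en = {(0, 0), (1, 1), (3, 3), (2, 4)}  # (fr_index, en_index) pairs

links = intersect_alignments(en_to_fr, fr_to_en)
annotations = {"connective_a": [2], "connective_b": [3]}
kept = filter_annotations(annotations, links)
```

In this toy case the two models disagree about English token 2, so its link is lost in the intersection and `connective_a` is filtered out, while `connective_b` is kept. Intersection favors precision over recall, which is why a missing link is treated here as a signal that the connective may have been dropped or made implicit in translation.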