Conference: International Conference on Computational Linguistics

Combining Natural and Artificial Examples to Improve Implicit Discourse Relation Identification



Abstract

This paper presents the first experiments on identifying implicit discourse relations (i.e., relations lacking an overt discourse connective) in French. Given the small amount of annotated data available for this task, our system resorts to additional data automatically labeled using unambiguous connectives, a method introduced by Marcu and Echihabi (2002). We first show that a system trained solely on these artificial data does not generalize well to natural implicit examples, echoing the conclusion reached by Sporleder and Lascarides (2008) for English. We then explain these initial results by analyzing the different types of distributional differences between natural and artificial implicit data. This leads us to propose a number of very simple methods, all inspired by work on domain adaptation, for combining the two types of data. Through various experiments on the French ANNODIS corpus, we show that our best system achieves an accuracy of 41.7%, a significant gain of 4.4% over a system trained solely on manually labeled data.
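As a rough illustration of the automatic labeling idea mentioned in the abstract, the sketch below turns a sentence containing an unambiguous connective into an artificial implicit example by removing the connective and using its relation as the label. The connective lexicon, the relation names, the `Example` type, and both function names are hypothetical and chosen only for illustration; they are not taken from the paper. The `combine` helper shows instance weighting, one simple scheme commonly used in domain adaptation; the abstract does not say which combination schemes were actually evaluated.

```python
import re
from typing import List, NamedTuple, Optional, Tuple

# Hypothetical lexicon of connectives assumed to be unambiguous.
# Neither the entries nor the relation names come from the paper.
UNAMBIGUOUS_CONNECTIVES = {
    "parce que": "explanation",
    "en revanche": "contrast",
    "par conséquent": "result",
}


class Example(NamedTuple):
    arg1: str
    arg2: str
    relation: str
    source: str  # "artificial" or "natural"


def make_artificial_example(sentence: str) -> Optional[Example]:
    """Split a sentence at the first listed connective found and drop it.

    Returns None when no listed connective occurs or an argument is empty.
    """
    for connective, relation in UNAMBIGUOUS_CONNECTIVES.items():
        pattern = r"\b" + re.escape(connective) + r"\b"
        match = re.search(pattern, sentence, flags=re.IGNORECASE)
        if match is None:
            continue
        arg1 = sentence[:match.start()].strip(" ,;")
        arg2 = sentence[match.end():].strip(" ,;.")
        if arg1 and arg2:
            return Example(arg1, arg2, relation, "artificial")
    return None


def combine(
    natural: List[Example],
    artificial: List[Example],
    artificial_weight: float = 0.5,
) -> List[Tuple[Example, float]]:
    """Pool both data types, down-weighting artificial examples.

    This is an assumed instance-weighting scheme, not the paper's method.
    """
    weighted = [(ex, 1.0) for ex in natural]
    weighted += [(ex, artificial_weight) for ex in artificial]
    return weighted
```

The resulting (example, weight) pairs could be passed to any classifier that accepts per-instance weights; the weight value itself would need to be tuned on held-out natural data.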
