Journal: Computational Linguistics

Representing Discourse Coherence: A Corpus-Based Study

Abstract

This article aims to present a set of discourse structure relations that are easy to code and to develop criteria for an appropriate data structure for representing these relations. Discourse structure here refers to informational relations that hold between sentences in a discourse. The set of discourse relations introduced here is based on Hobbs (1985). We present a method for annotating discourse coherence structures that we used to manually annotate a database of 135 texts from the Wall Street Journal and the AP Newswire. All texts were independently annotated by two annotators. Kappa values of greater than 0.8 indicated good interannotator agreement. We furthermore present evidence that trees are not a descriptively adequate data structure for representing discourse structure: In coherence structures of naturally occurring texts, we found many different kinds of crossed dependencies, as well as many nodes with multiple parents. The claims are supported by statistical results from our hand-annotated database of 135 texts.
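
The two structural properties named in the abstract, nodes with multiple parents and crossed dependencies, can be illustrated with a minimal sketch. It assumes a simplified encoding that is not taken from the article: discourse segments are integers in text order, and each coherence relation is a labeled edge between two segments. The toy edges, the relation labels, and the function names are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy annotation (not data from the article).
# Each edge is (source_segment, target_segment, relation_label);
# segment indices follow the order of segments in the text.
edges = [
    (0, 1, "elaboration"),
    (0, 2, "cause-effect"),
    (1, 3, "similarity"),
    (2, 3, "contrast"),
]

def nodes_with_multiple_parents(edges):
    """Return segments that are the target of edges from more than one source."""
    parents = defaultdict(set)
    for src, dst, _label in edges:
        parents[dst].add(src)
    return sorted(node for node, ps in parents.items() if len(ps) > 1)

def crossed_dependencies(edges):
    """Return pairs of edge spans (a, b), (c, d) that interleave: a < c < b < d."""
    # Normalize each edge to an ordered span and sort, so that for every
    # pair produced below the first span starts no later than the second.
    spans = sorted({(min(s, d), max(s, d)) for s, d, _label in edges})
    return [
        ((a, b), (c, d))
        for (a, b), (c, d) in combinations(spans, 2)
        if a < c < b < d
    ]

def fits_in_a_tree(edges):
    """A tree allows neither multiple parents nor crossing edges."""
    return not nodes_with_multiple_parents(edges) and not crossed_dependencies(edges)

print(nodes_with_multiple_parents(edges))
print(crossed_dependencies(edges))
print(fits_in_a_tree(edges))
```

In this toy graph, segment 3 has two parents and the span from segment 0 to 2 crosses the span from segment 1 to 3, so a tree cannot hold all four relations without dropping or duplicating something. A general graph keeps them as they are, which is the kind of representational choice the abstract argues for.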