ACM Transactions on Asian Language Information Processing

A CDT-Styled End-to-End Chinese Discourse Parser


Abstract

Discourse parsing is a challenging task that plays a critical role in discourse analysis. Since the release of the Rhetorical Structure Theory Discourse Treebank and the Penn Discourse Treebank, research on English discourse parsing has attracted increasing attention and achieved considerable success. Meanwhile, for other languages, such as Chinese, only preliminary research on certain subtasks of discourse parsing has been conducted. In this article, we present an end-to-end Chinese discourse parser based on the Connective-Driven Dependency Tree scheme. The parser consists of multiple components in a pipeline architecture, including an elementary discourse unit (EDU) detector, a discourse relation recognizer, a discourse parse tree generator, and an attribution labeler. In particular, the attribution labeler determines two attributions (i.e., sense and centering) for every nonterminal node (i.e., discourse relation) in the discourse parse tree. Overall, our parser detects all EDUs in free text, generates the discourse parse tree in a bottom-up way, and determines the sense and centering attributions for all nonterminal nodes by traversing the discourse parse tree. We conduct a comprehensive evaluation on the Connective-Driven Dependency Treebank corpus from both component-wise and error-cascading perspectives, to illustrate how each component performs in isolation and how the pipeline performs under error propagation. The evaluation shows that our end-to-end Chinese discourse parser achieves an overall F1 score of 20% with full automation.
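The pipeline described in the abstract (EDU detection, bottom-up tree construction, then attribution labeling by tree traversal) might be sketched roughly as follows. This is only an illustrative skeleton, not the authors' implementation: every name here (`EDU`, `Relation`, `detect_edus`, `score_pair`, `build_tree`, `label_attributions`, `parse`) is hypothetical, the punctuation-based segmenter and length-based scorer are stand-ins for trained models, and the tree is assumed to be binary for simplicity.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass
class EDU:
    """Leaf node: one elementary discourse unit."""
    text: str


@dataclass
class Relation:
    """Nonterminal node: a discourse relation over two adjacent spans."""
    left: "Node"
    right: "Node"
    sense: Optional[str] = None
    centering: Optional[str] = None


Node = Union[EDU, Relation]


def span_text(node: Node) -> str:
    """Concatenate the EDU texts covered by a node."""
    if isinstance(node, EDU):
        return node.text
    return span_text(node.left) + span_text(node.right)


def detect_edus(text: str) -> List[EDU]:
    # Assumption: split on clause-final punctuation as a stand-in
    # for a trained EDU detector.
    parts = re.split(r"[，；。！？]", text)
    return [EDU(p.strip()) for p in parts if p.strip()]


def score_pair(left: Node, right: Node) -> float:
    # Assumption: placeholder for a relation recognizer; it simply
    # prefers merging the shortest adjacent spans first.
    return -float(len(span_text(left)) + len(span_text(right)))


def build_tree(edus: List[EDU]) -> Node:
    """Bottom-up construction: repeatedly merge the best adjacent pair."""
    if not edus:
        raise ValueError("no EDUs detected")
    nodes: List[Node] = list(edus)
    while len(nodes) > 1:
        best = max(range(len(nodes) - 1),
                   key=lambda i: score_pair(nodes[i], nodes[i + 1]))
        nodes[best:best + 2] = [Relation(nodes[best], nodes[best + 1])]
    return nodes[0]


def label_attributions(node: Node) -> None:
    """Post-order traversal assigning sense and centering to each relation."""
    if isinstance(node, EDU):
        return
    label_attributions(node.left)
    label_attributions(node.right)
    # Assumption: placeholders where trained classifiers would predict labels.
    node.sense = "UNKNOWN"
    node.centering = "UNKNOWN"


def parse(text: str) -> Node:
    """Run the three stages in sequence; errors in one stage reach the next."""
    tree = build_tree(detect_edus(text))
    label_attributions(tree)
    return tree
```

Because each stage consumes the previous stage's output, a mistake in EDU detection changes the leaves that the tree builder and labeler see, which is the error propagation the abstract's error-cascading evaluation measures.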
