Venue: International Joint Conference on Natural Language Processing
Split or Merge: Which is Better for Unsupervised RST Parsing?



Abstract

Rhetorical Structure Theory (RST) parsing is crucial for many downstream NLP tasks that require a discourse structure for a text. Most previous RST parsers have been based on supervised learning approaches. That is, they require an annotated corpus of sufficient size and quality, and they depend heavily on the language and domain of that corpus. In this paper, we present two language-independent unsupervised RST parsing methods based on dynamic programming. The first builds the optimal tree in terms of a dissimilarity score function defined for splitting a text span into smaller ones. The second builds the optimal tree in terms of a similarity score function defined for merging two adjacent spans into a larger one. Experimental results on English and German RST treebanks showed that our parser based on span merging achieved the best score, around 0.8 F_1, which is close to the scores of previous supervised parsers.
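The span-merging method described in the abstract can be illustrated with a minimal CKY-style dynamic program: over a sequence of elementary discourse units (EDUs), it builds the binary tree that maximizes the total similarity of every pair of adjacent spans merged. This is only a sketch of the general technique, not the paper's implementation; the `sim` function is a hypothetical placeholder for the paper's similarity score function.

```python
# Minimal CKY-style DP sketch of span merging (not the paper's actual code).
# `sim((i, k), (k + 1, j))` is a hypothetical stand-in for the similarity
# score function; the real parser defines this from the text itself.

def best_merge_tree(edus, sim):
    n = len(edus)
    # score[i][j]: best total similarity achievable for the span edus[i..j]
    score = [[0.0] * n for _ in range(n)]
    # split[i][j]: split point k that achieves score[i][j]
    split = [[None] * n for _ in range(n)]
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            best = None
            for k in range(i, j):
                s = score[i][k] + score[k + 1][j] + sim((i, k), (k + 1, j))
                if best is None or s > best:
                    best, split[i][j] = s, k
            score[i][j] = best
    # Recover the binary tree from the recorded split points.
    def tree(i, j):
        if i == j:
            return edus[i]
        k = split[i][j]
        return (tree(i, k), tree(k + 1, j))
    return tree(0, n - 1), score[0][n - 1]
```

The span-splitting method is the mirror image of this recurrence: it starts from the whole text and chooses the split point that maximizes a dissimilarity score between the two resulting sub-spans.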
