
Method and system for simplifying implicit rhetorical relation prediction in large scale annotated corpus


Abstract

The present invention provides a method and system for predicting implicit rhetorical relations between two spans of text, e.g., in a large annotated corpus such as the Penn Discourse Treebank ("PDTB"), the Rhetorical Structure Theory corpus, or the Discourse Graph Bank, and in particular for determining a rhetorical relation in the absence of an explicit discourse marker. Surface-level features may be used to capture pragmatic information encoded in the absent marker. In one manner, a simplified feature set based only on raw text and semantic dependencies is used to improve performance for all relations. By using surface-level features to predict implicit rhetorical relations for the large annotated corpus, the invention approaches a theoretical maximum performance, suggesting that more data will not necessarily improve performance based on these and similarly situated features.
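As a rough illustration of what "surface-level features" computed from raw text could look like, the following minimal Python sketch builds a feature dictionary for a pair of text spans. The function names (`tokenize`, `surface_features`) and the particular features (cross-span word pairs, first and last tokens, length difference) are assumptions chosen for illustration; the abstract does not enumerate the feature set, and the semantic-dependency features it mentions are not modeled here.

```python
import re
from itertools import product


def tokenize(text):
    """Lowercase the text and split it into simple word tokens (hypothetical helper)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def surface_features(span1, span2):
    """Build illustrative surface features for two text spans.

    The feature choices are assumptions for this sketch, not the
    feature set claimed in the patent.
    """
    t1, t2 = tokenize(span1), tokenize(span2)
    feats = {}
    # Cross-span word pairs: one binary feature per (word in span 1, word in span 2).
    for w1, w2 in product(set(t1), set(t2)):
        feats[f"pair={w1}|{w2}"] = 1
    # First and last tokens of each span.
    if t1:
        feats[f"arg1_first={t1[0]}"] = 1
        feats[f"arg1_last={t1[-1]}"] = 1
    if t2:
        feats[f"arg2_first={t2[0]}"] = 1
        feats[f"arg2_last={t2[-1]}"] = 1
    # Difference in token counts between the two spans.
    feats["len_diff"] = len(t1) - len(t2)
    return feats


if __name__ == "__main__":
    example = surface_features("It rained all day.", "The match was cancelled.")
    for name in sorted(example):
        print(name, example[name])
```

A dictionary of this kind could then be passed to any standard classifier trained on annotated relation labels; the choice of classifier is left open here because the abstract does not name one.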

Bibliographic details

  • Publication number: AU2014285073B9

  • Publication date: 2017-04-06

  • Original format: PDF

  • Applicant/patentee: THOMSON REUTERS GLOBAL RESOURCES

  • Application number: AU20140285073

  • Inventors: HOWALD BLAKE; NYSTROM ANDREW

  • Filing date: 2014-07-03

  • Classification: G06F17/30; G06F15/18

  • Country: AU

  • Record added: 2022-08-21 13:33:06
