Journal: Computational Linguistics

Tree kernels for semantic role labeling

Abstract

The availability of large-scale data sets of manually annotated predicate-argument structures has recently favored the use of machine learning approaches in the design of automated semantic role labeling (SRL) systems. The main research in this area concerns two design choices: the feature representation, and the effective decomposition of the task across different learning models. Regarding the former, structural properties of full syntactic parses are widely employed, as they encode different principles suggested by the linking theory between syntax and semantics. The latter relates to several learning schemes over global views of the parses. For example, re-ranking stages operating over alternative predicate-argument sequences of the same sentence have been shown to be very effective. In this article, we propose several kernel functions to model parse tree properties in kernel-based machines, for example, perceptrons or support vector machines. In particular, we define different kinds of tree kernels as general approaches to feature engineering in SRL. Moreover, we experiment extensively with such kernels to investigate their contribution to individual stages of an SRL architecture, both in isolation and in combination with other traditional manually coded features. The results for the boundary recognition, classification, and re-ranking stages provide systematic evidence of the significant impact of tree kernels on overall accuracy, especially when the amount of training data is small. In conclusion, tree kernels provide a general and easily portable feature engineering method that is applicable to a large family of natural language processing tasks.
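To make the notion of a tree kernel concrete, here is a minimal sketch of the classic subset-tree kernel of Collins and Duffy, which counts the tree fragments shared by two parse trees. It is an illustration only and is not necessarily one of the kernels defined in the article; the `Node` class, the helper names, and the decay parameter `lam` are assumptions introduced for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical parse-tree node: a label plus an ordered tuple of children."""
    label: str
    children: tuple = ()


def production(node):
    """Grammar production at a node: its label and the labels of its children."""
    return (node.label, tuple(child.label for child in node.children))


def all_nodes(tree):
    """Yield every node of the tree in pre-order."""
    yield tree
    for child in tree.children:
        yield from all_nodes(child)


def subset_tree_kernel(t1, t2, lam=1.0):
    """Count common fragments of t1 and t2, each weighted by lam per production.

    With lam=1.0 this is the plain count of shared fragments.
    """
    memo = {}

    def delta(a, b):
        # Leaves (words) never root a fragment, and differing
        # productions share no fragment rooted at this pair.
        if not a.children or not b.children:
            return 0.0
        if production(a) != production(b):
            return 0.0
        key = (id(a), id(b))
        if key not in memo:
            result = lam
            # Each child is either left unexpanded (the 1) or expanded
            # into any fragment common to both children (the delta term).
            for child_a, child_b in zip(a.children, b.children):
                result *= 1.0 + delta(child_a, child_b)
            memo[key] = result
        return memo[key]

    return sum(delta(a, b) for a in all_nodes(t1) for b in all_nodes(t2))


def leaf(tag, word):
    """Build a pre-terminal node such as (D the)."""
    return Node(tag, (Node(word),))


if __name__ == "__main__":
    dog = Node("NP", (leaf("D", "the"), leaf("N", "dog")))
    cat = Node("NP", (leaf("D", "the"), leaf("N", "cat")))
    print(subset_tree_kernel(dog, dog))
    print(subset_tree_kernel(dog, cat))
```

A kernel value computed this way can be handed to any kernel-based learner, such as a perceptron or a support vector machine, in place of an explicit feature vector. That is the sense in which a tree kernel serves as a feature engineering method: the learner implicitly works in the space of all tree fragments without enumerating them.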