Data-driven sentence generation with non-isomorphic trees

Abstract

The abstract structures from which generation naturally starts often contain no functional nodes, whereas surface-syntactic structures, or the chain of tokens in a linearized tree, contain all of them. Data-driven linguistic generation therefore needs to cope with projection between non-isomorphic structures that differ in both topology and number of nodes. So far, such projection has been a challenge in data-driven generation and has largely been avoided. We present a fully stochastic generator that is able to cope with projection between non-isomorphic structures. The generator, which starts from PropBank-like structures, consists of a cascade of SVM-classifier-based submodules that map the input structures onto sentences in a series of transitions. The generator has been evaluated for English on the Penn Treebank and for Spanish on the multi-layered Ancora-UPF corpus.
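To make the idea of projecting between non-isomorphic structures concrete, here is a minimal sketch, not the authors' implementation. It assumes a toy two-stage cascade: the first stage adds functional nodes to a predicate-argument tree (so the node count changes), and the second linearizes the result. All names (`Node`, `needs_determiner`, `goes_before_head`, `generate`) are hypothetical, and the two decision functions are hand-written stand-ins for what would be trained SVM classifiers in the system described above. Morphological realization is left out, so the output consists of lemmas.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    lemma: str
    pos: str                      # coarse part of speech, e.g. "N", "V", "DET"
    role: str = ""                # relation to the parent, e.g. "A0", "A1", "det"
    children: List["Node"] = field(default_factory=list)


def count_nodes(node: Node) -> int:
    """Number of nodes in the tree rooted at `node`."""
    return 1 + sum(count_nodes(child) for child in node.children)


def needs_determiner(node: Node) -> bool:
    # Stand-in decision (assumption): a trained classifier would decide this
    # from features of the node and its context.
    return node.pos == "N"


def add_function_nodes(node: Node) -> Node:
    """Stage 1: return a copy of the tree with functional nodes added."""
    children = [add_function_nodes(child) for child in node.children]
    if needs_determiner(node):
        children.insert(0, Node("the", "DET", "det"))
    return Node(node.lemma, node.pos, node.role, children)


def goes_before_head(child: Node) -> bool:
    # Stand-in decision (assumption): a trained classifier would predict
    # the position of each dependent relative to its head.
    return child.role in {"A0", "det"}


def linearize(node: Node) -> List[str]:
    """Stage 2: flatten the tree into a sequence of tokens."""
    before: List[str] = []
    after: List[str] = []
    for child in node.children:
        target = before if goes_before_head(child) else after
        target.extend(linearize(child))
    return before + [node.lemma] + after


def generate(root: Node) -> List[str]:
    """Run the cascade: each stage consumes the previous stage's output."""
    return linearize(add_function_nodes(root))


if __name__ == "__main__":
    root = Node("chase", "V", children=[
        Node("dog", "N", "A0"),
        Node("cat", "N", "A1"),
    ])
    print(count_nodes(root), "input nodes")
    print(count_nodes(add_function_nodes(root)), "nodes after stage 1")
    print(" ".join(generate(root)))
```

The input tree and the tree after the first stage have different numbers of nodes, which is the non-isomorphism the abstract refers to. In a data-driven version, each stand-in function would be replaced by a classifier trained on aligned abstract and surface annotations.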