Venue: International Conference on Computational Semantics

Leveraging a Semantically Annotated Corpus to Disambiguate Prepositional Phrase Attachment


Abstract

Accurate parse ranking requires semantic information, since a sentence may have many candidate parses involving common syntactic constructions. In this paper, we propose a probabilistic framework for incorporating distributional semantic information into a maximum entropy parser. Furthermore, to better deal with sparse data, we use a modified version of Latent Dirichlet Allocation to smooth the probability estimates. This LDA model generates pairs of lemmas, representing the two arguments of a semantic relation, and can be trained, in an unsupervised manner, on a corpus annotated with semantic dependencies. To evaluate our framework in isolation from the rest of a parser, we consider the special case of prepositional phrase attachment ambiguity. The results show that our semantically-motivated feature is effective in this case, and moreover, the LDA smoothing both produces semantically interpretable topics, and also improves performance over raw co-occurrence frequencies, demonstrating that it can successfully generalise patterns in the training data.
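The abstract's core idea, scoring candidate attachment sites by the probability of the resulting semantic relation over lemma pairs, can be illustrated with a minimal sketch. The lemma counts, the add-alpha smoothing (standing in for the paper's LDA smoothing), and the function names below are all illustrative assumptions, not the authors' implementation:

```python
from collections import Counter

# Hypothetical semantic-dependency counts: each key is a
# (head_lemma, preposition, dependent_lemma) triple, as might be
# extracted from a corpus annotated with semantic dependencies.
counts = Counter({
    ("eat", "with", "fork"): 8,        # verb attachment: eat ... with a fork
    ("eat", "with", "friend"): 5,
    ("salad", "with", "dressing"): 6,  # noun attachment: salad with dressing
})

def attachment_score(head, prep, dep, alpha=1.0, vocab=1000):
    """Add-alpha smoothed relative frequency of the (head, prep, dep)
    triple; a crude stand-in for the paper's LDA-smoothed estimate."""
    total = sum(c for (h, p, _), c in counts.items()
                if h == head and p == prep)
    return (counts[(head, prep, dep)] + alpha) / (total + alpha * vocab)

def resolve_pp(verb, noun, prep, dep):
    """Attach the PP to whichever site yields the more probable
    semantic relation between head and dependent lemmas."""
    v = attachment_score(verb, prep, dep)
    n = attachment_score(noun, prep, dep)
    return "verb" if v >= n else "noun"

print(resolve_pp("eat", "salad", "with", "fork"))      # -> verb
print(resolve_pp("eat", "salad", "with", "dressing"))  # -> noun
```

Raw co-occurrence frequencies like these are sparse, which is exactly why the paper smooths them with a modified LDA that generates the lemma pairs from latent topics.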
