Conference: International Joint Conference on Natural Language Processing

Improving Word Sense Disambiguation by Pseudo Samples


Abstract

Data sparseness is a major problem in word sense disambiguation. Automatic sample acquisition and smoothing are two approaches that have been explored to alleviate its influence. In this paper, we consider a combination of the two. First, we propose a pattern-based method for acquiring pseudo samples; we then estimate the conditional probabilities of variables by combining the pseudo data set with the sense-tagged data set. The combined estimation lets us strike an appropriate balance between the two data sets, which is vital to achieving the best performance. Experiments show that our approach brings significant improvement for Chinese word sense disambiguation.
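
The abstract does not state the exact form of the combined estimator. As a rough illustration only, one common way to balance two data sets is linear interpolation of smoothed conditional probabilities. The sketch below assumes that form; the function names, the add-alpha smoothing, the weight `lam`, and the toy English samples are all hypothetical and are not taken from the paper.

```python
# Illustrative sketch only: linear interpolation is an ASSUMED way of combining
# the two data sets; the paper's actual estimator is not given in the abstract.

def cond_prob(samples, feature, sense, vocab_size, alpha=1.0):
    """Add-alpha smoothed estimate of P(feature | sense).

    `samples` is a list of (context_words, sense_label) pairs.
    """
    feature_count = 0
    total_count = 0
    for context_words, label in samples:
        if label != sense:
            continue
        total_count += len(context_words)
        feature_count += context_words.count(feature)
    return (feature_count + alpha) / (total_count + alpha * vocab_size)


def combined_prob(tagged, pseudo, feature, sense, vocab_size, lam=0.5):
    """Interpolate the sense-tagged and pseudo-sample estimates.

    `lam` (hypothetical parameter) is the weight on the sense-tagged data;
    1 - lam is the weight on the pseudo samples.
    """
    p_tagged = cond_prob(tagged, feature, sense, vocab_size)
    p_pseudo = cond_prob(pseudo, feature, sense, vocab_size)
    return lam * p_tagged + (1.0 - lam) * p_pseudo


# Toy data (hypothetical): one hand-tagged sample and one pseudo sample.
tagged_samples = [(["bank", "money"], "finance")]
pseudo_samples = [(["money", "loan"], "finance")]
vocabulary = ["bank", "money", "loan"]

p_loan = combined_prob(tagged_samples, pseudo_samples, "loan", "finance",
                       vocab_size=len(vocabulary), lam=0.5)
print(p_loan)
```

In a setup like this, `lam` would typically be tuned on held-out sense-tagged data: setting it to 1 ignores the pseudo samples entirely, while lower values let the larger but noisier pseudo data set contribute more.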
