20th International Conference on Computational Linguistics, vol. 2

Semi-Supervised Training of a Kernel PCA-Based Model for Word Sense Disambiguation


Abstract

In this paper, we introduce a new semi-supervised learning model for word sense disambiguation (WSD) based on Kernel Principal Component Analysis (KPCA). Experiments show that it further improves accuracy over supervised KPCA models, which have themselves achieved WSD accuracy superior to the best published individual models. Although empirical results show that supervised KPCA models are significantly more accurate than the state-of-the-art naive Bayes and maximum entropy models on Senseval-2 data, we identify specific sparse-data conditions under which supervised KPCA models deteriorate to essentially a most-frequent-sense predictor. We discuss the potential of KPCA for leveraging unannotated data for partially unsupervised training to address these issues, leading to a composite model that combines the supervised and semi-supervised models.
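
As a rough illustration of the idea in the abstract, the sketch below fits kernel PCA on labeled and unlabeled context vectors together, then labels a test context by its nearest labeled neighbor in the projected space. This is not the authors' implementation: the polynomial kernel, the nearest-neighbor rule, the toy six-word feature space for the ambiguous word "bank", and all function names (`fit_kpca`, `transform`, `predict_nearest`) are assumptions made for this example.

```python
import numpy as np

def poly_kernel(A, B, degree=2):
    """Polynomial kernel k(x, y) = (x . y + 1) ** degree (an assumed choice)."""
    return (A @ B.T + 1.0) ** degree

def fit_kpca(X, n_components, degree=2):
    """Fit kernel PCA on the rows of X (may mix labeled and unlabeled vectors)."""
    n = X.shape[0]
    K = poly_kernel(X, X, degree)
    one_n = np.full((n, n), 1.0 / n)
    # Center the kernel matrix in feature space.
    Kc = K - one_n @ K - K @ one_n + one_n @ K @ one_n
    eigvals, eigvecs = np.linalg.eigh(Kc)
    # eigh returns eigenvalues in ascending order; keep the largest ones.
    order = np.argsort(eigvals)[::-1][:n_components]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Scale coefficients so each feature-space principal axis has unit norm.
    alphas = eigvecs / np.sqrt(np.maximum(eigvals, 1e-12))
    return {"X": X, "K": K, "alphas": alphas, "degree": degree}

def transform(model, Z):
    """Project the rows of Z onto the fitted kernel principal components."""
    K_train = model["K"]
    Kz = poly_kernel(Z, model["X"], model["degree"])
    # Center the new kernel rows consistently with the training kernel.
    Kz_c = (Kz - Kz.mean(axis=1, keepdims=True)
            - K_train.mean(axis=0) + K_train.mean())
    return Kz_c @ model["alphas"]

def predict_nearest(model, X_lab, y_lab, X_test):
    """Give each test row the label of its nearest labeled row after projection."""
    P_lab = transform(model, X_lab)
    P_test = transform(model, X_test)
    dists = ((P_test[:, None, :] - P_lab[None, :, :]) ** 2).sum(axis=2)
    return [y_lab[i] for i in dists.argmin(axis=1)]

# Hypothetical binary context features for the ambiguous word "bank":
# columns = [money, loan, deposit, river, water, shore]
X_lab = np.array([[1, 1, 0, 0, 0, 0],
                  [0, 0, 0, 1, 1, 0]], dtype=float)
y_lab = ["finance", "river"]
X_unlab = np.array([[0, 1, 1, 0, 0, 0],
                    [1, 0, 1, 0, 0, 0],
                    [0, 0, 0, 0, 1, 1],
                    [0, 0, 0, 1, 0, 1]], dtype=float)
# Test contexts that share no feature with either labeled example.
X_test = np.array([[0, 0, 1, 0, 0, 0],
                   [0, 0, 0, 0, 0, 1]], dtype=float)

# Semi-supervised variant: unlabeled contexts also shape the components.
semi_model = fit_kpca(np.vstack([X_lab, X_unlab]), n_components=1)
preds = predict_nearest(semi_model, X_lab, y_lab, X_test)
print(preds)

# Supervised-only variant for comparison: components from labeled data alone.
sup_model = fit_kpca(X_lab, n_components=1)
print(transform(sup_model, X_test))
```

In this toy setup the test contexts share no feature with either labeled example, so a model fitted on the two labeled vectors alone projects both test contexts to the same point and cannot separate them. The unannotated contexts are what tie "deposit" to the finance cluster and "shore" to the river cluster.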
