A Baseline Methodology for Word Sense Disambiguation

Abstract

This paper describes a methodology for supervised word sense disambiguation that relies on standard machine learning algorithms to induce classifiers from sense-tagged training examples, where the context in which an ambiguous word occurs is represented by simple lexical features. This constitutes a baseline approach, since it produces classifiers based on easy-to-identify features that result in accurate disambiguation across a variety of languages. The paper reviews several systems based on this methodology that participated in the Spanish and English lexical sample tasks of SENSEVAL-2, a comparative exercise among word sense disambiguation systems. These systems fared much better than standard baselines and were within seven to ten percentage points of the accuracy of the most highly ranked systems.
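
To make the general idea concrete, here is a minimal sketch of a supervised sense classifier induced from sense-tagged examples with simple lexical features. It is an illustration under stated assumptions, not the systems the paper reviews: the abstract does not name a learning algorithm or a feature set, so the simplified Naive Bayes scorer, the word-presence features, the function names (`features`, `train`, `classify`), and the toy "bank" data are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def features(context):
    # Assumed feature set: the lowercased word types that occur in the context.
    return set(context.lower().split())


def train(examples):
    """Count senses and per-sense feature occurrences.

    examples: list of (context, sense) pairs, i.e. sense-tagged training data.
    """
    sense_counts = Counter()
    feature_counts = defaultdict(Counter)
    vocab = set()
    for context, sense in examples:
        sense_counts[sense] += 1
        for f in features(context):
            feature_counts[sense][f] += 1
            vocab.add(f)
    return sense_counts, feature_counts, vocab


def classify(model, context):
    """Return the sense with the highest smoothed log score.

    Simplification (an assumption of this sketch): only features that are
    present in the context and were seen in training contribute to the score.
    """
    sense_counts, feature_counts, vocab = model
    total = sum(sense_counts.values())
    known = features(context) & vocab
    best_sense, best_score = None, float("-inf")
    for sense, n in sense_counts.items():
        score = math.log(n / total)  # log prior of the sense
        for f in known:
            # Add-one smoothing over the two outcomes: feature present or absent.
            score += math.log((feature_counts[sense][f] + 1) / (n + 2))
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense


# Hypothetical sense-tagged examples for the ambiguous word "bank".
examples = [
    ("he deposited money at the bank", "finance"),
    ("the bank approved the loan", "finance"),
    ("they fished from the river bank", "river"),
    ("the muddy bank of the stream", "river"),
]
model = train(examples)
print(classify(model, "the bank approved my loan"))
print(classify(model, "fished near the muddy bank"))
```

Because the features are plain word tokens, a sketch like this needs no language-specific resources beyond a tokenizer, which is the property that makes such an approach a natural baseline across languages.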