Conference: International Joint Conference on Natural Language Processing

Concept-Based Sense Disambiguation for Korean Nouns

Abstract

Most previous corpus-based approaches to word sense disambiguation (WSD) collect salient words from the context of a target word. However, they suffer from data sparseness. To overcome this problem, this paper proposes a concept-based WSD method that uses an automatically generated sense-tagged corpus. Grammatical similarities between Korean and Japanese enable the construction of a sense-tagged Korean corpus through an existing high-quality Japanese-to-Korean machine translation (MT) system. The sense-tagged corpus serves as a knowledge source from which useful clues for WSD, such as concept co-occurrence information, can be extracted. The proposed WSD model is applied to a Korean-to-Japanese MT system and tested with various machine learning methods. In an evaluation, a weighted voting model achieved the best average precision of 81.50%, an improvement of 18.75% over the baseline, which shows that the proposed method is very promising for practical MT systems.
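
The abstract does not say how the weighted voting model combines its component classifiers. As an illustration only, here is a minimal sketch of one common form of weighted voting: each classifier casts a vote for a sense, and the sense with the largest total weight wins. The function name `weighted_vote`, the classifier names, the sense labels, and the weights are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """Return the sense label with the largest total vote weight.

    predictions: dict mapping classifier name -> predicted sense label
    weights: dict mapping classifier name -> non-negative vote weight
             (for example, each classifier's precision on held-out data;
             this weighting scheme is an assumption, not the paper's)
    """
    scores = defaultdict(float)
    for name, sense in predictions.items():
        # Classifiers without a listed weight contribute nothing.
        scores[sense] += weights.get(name, 0.0)
    # Iterate in sorted order so ties resolve to the smallest label.
    return max(sorted(scores), key=lambda sense: scores[sense])


# Hypothetical example: the Korean noun "nun" can mean "eye" or "snow".
predictions = {
    "naive_bayes": "nun/eye",
    "decision_tree": "nun/snow",
    "knn": "nun/eye",
}
weights = {"naive_bayes": 0.6, "decision_tree": 0.9, "knn": 0.5}
chosen = weighted_vote(predictions, weights)
```

Here two lower-weight classifiers together outvote one higher-weight classifier. With a sufficiently large weight, a single classifier can override the majority instead, which is the main difference from simple majority voting.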
