Unsupervised Natural Language Disambiguation Using Non-Ambiguous Words

Abstract

This chapter describes an unsupervised approach to natural language disambiguation, applicable to ambiguity problems in which equivalence classes can be defined over the set of words in a lexicon. Lexical knowledge is induced from non-ambiguous words via these equivalence classes, which enables the automatic generation of annotated corpora. The only requirements are a lexicon and a raw text corpus. The method was tested on two natural language ambiguity tasks in several languages: part-of-speech tagging (English, Swedish, Chinese) and word sense disambiguation (English, Romanian). Classifiers trained on the automatically constructed corpora performed comparably to classifiers trained on expensive manually annotated data.
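
The idea of inducing annotations from non-ambiguous words can be illustrated with a minimal Python sketch for the part-of-speech case. It is not the chapter's implementation: the toy lexicon, the corpus, the function names (`annotate_unambiguous`, `train_context_counts`, `disambiguate`), and the previous-word counting classifier are all assumptions made for illustration. Here the equivalence class of a word is taken to be its set of possible tags, and words with exactly one tag supply the automatically labelled examples.

```python
from collections import Counter

# Hypothetical toy lexicon: word -> set of possible part-of-speech tags.
LEXICON = {
    "the": {"DET"},
    "a": {"DET"},
    "dog": {"NOUN"},
    "cat": {"NOUN"},
    "sleeps": {"VERB"},
    "eats": {"VERB"},
    "walk": {"NOUN", "VERB"},  # ambiguous: never used as training data
}

# Hypothetical raw (unannotated) corpus.
CORPUS = ["the dog sleeps", "a cat eats", "the walk", "the dog eats"]


def annotate_unambiguous(sentences, lexicon):
    """Return (previous_word, word, tag) for every token that has exactly one tag."""
    examples = []
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            tags = lexicon.get(word, set())
            if len(tags) == 1:
                prev = tokens[i - 1] if i > 0 else "<s>"
                examples.append((prev, word, next(iter(tags))))
    return examples


def train_context_counts(examples):
    """Count how often each tag follows each previous word (a stand-in classifier)."""
    counts = Counter()
    for prev, _word, tag in examples:
        counts[(prev, tag)] += 1
    return counts


def disambiguate(prev, word, lexicon, counts):
    """Pick the candidate tag seen most often after `prev` in the induced corpus."""
    candidates = sorted(lexicon[word])  # sorted for a deterministic tie-break
    return max(candidates, key=lambda tag: counts[(prev, tag)])


examples = annotate_unambiguous(CORPUS, LEXICON)
counts = train_context_counts(examples)
print(disambiguate("the", "walk", LEXICON, counts))  # NOUN
print(disambiguate("dog", "walk", LEXICON, counts))  # VERB
```

The ambiguous word "walk" is never labelled directly; its tag is chosen from contexts observed around non-ambiguous nouns and verbs. A realistic system would replace the single previous-word feature with a richer feature set and a trained classifier.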
