Evaluating Roget's Thesauri

Abstract

Roget's Thesaurus has gone through many revisions since it was first published 150 years ago. But how do these revisions affect Roget's usefulness for NLP? We examine the differences in content between the 1911 and 1987 versions of Roget's, and we compare both versions with each other and with WordNet on problems such as synonym identification and word relatedness. We also present a novel method for measuring sentence relatedness that can be implemented in either version of Roget's or in WordNet. Although the 1987 version of the Thesaurus is better, we show that the 1911 version performs surprisingly well and that the differences between the versions of Roget's and WordNet are often not statistically significant. We hope that this work will encourage others to use the 1911 Roget's Thesaurus in NLP tasks.