International Conference on Recent Advances in Natural Language Processing

Extraction of the Multiword Lexical Units in the Perspective of the Wordnet Expansion

Abstract

The paper focuses on selecting an optimal set of Multiword Expression extraction methods to serve as a tool during wordnet expansion. Wordnet multiword lexical units form a broad class, and it is difficult to find a single extraction method that covers it. Many association measures were tested on very large corpora and a very large wordnet, namely plWordNet. Several new measures are proposed and compared with selected methods from the literature. Two ways of combining measures into ensembles were also analysed. We showed that the selection of methods and the tuning of their parameters can be transferred between two large corpora. The comparison of the extracted collocations with the huge set of plWordNet multiword lexical units revealed that the performance of the methods is far below the optimistic levels reported in the literature. However, a carefully selected set and combination of methods can be a valuable tool for lexicographers.
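
As a rough illustration of what an association measure and an ensemble of measures can look like, the sketch below scores adjacent word pairs with two textbook measures (pointwise mutual information and the Dice coefficient) and combines them by averaging ranks. The abstract does not name the measures or the combination schemes the paper uses, so everything here is an assumption: the function names `bigram_scores` and `rank_average` are hypothetical, and rank averaging is only one generic way to build an ensemble.

```python
import math
from collections import Counter


def bigram_scores(tokens):
    """Return {(w1, w2): (pmi, dice)} for every adjacent token pair.

    PMI and Dice are standard association measures, chosen here only
    for illustration; they are not confirmed to be the paper's measures.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = max(len(tokens) - 1, 1)
    scores = {}
    for (w1, w2), f12 in bigrams.items():
        p12 = f12 / n_bi
        p1 = unigrams[w1] / n_uni
        p2 = unigrams[w2] / n_uni
        pmi = math.log2(p12 / (p1 * p2))
        dice = 2 * f12 / (unigrams[w1] + unigrams[w2])
        scores[(w1, w2)] = (pmi, dice)
    return scores


def rank_average(scores):
    """Order candidates by their mean rank across all measures.

    A hypothetical ensemble: each measure ranks the candidates from
    best (1) to worst, and candidates are sorted by the average rank.
    """
    if not scores:
        return []
    candidates = list(scores)
    n_measures = len(next(iter(scores.values())))
    total = {c: 0 for c in candidates}
    for m in range(n_measures):
        ordered = sorted(candidates, key=lambda c: scores[c][m], reverse=True)
        for rank, c in enumerate(ordered, start=1):
            total[c] += rank
    return sorted(candidates, key=lambda c: total[c] / n_measures)


if __name__ == "__main__":
    toy = "new york is big and new york is old".split()
    for pair in rank_average(bigram_scores(toy)):
        print(pair)
```

On a toy input like this, PMI favours pairs of rare words, which is one reason a single measure rarely suffices and why combining measures is attractive. A real evaluation would compare the ranked candidates against a reference list of multiword lexical units, as the abstract describes for plWordNet.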