Journal: ACM Transactions on Asian and Low-Resource Language Information Processing

Developing the Persian Wordnet of Verbs Using Supervised Learning

Abstract

Wordnets are now widely used as a major resource in natural language processing and information retrieval, so their accuracy directly affects the performance of the applications that rely on them. This paper presents a fully automated method for extending a previously developed Persian wordnet with more comprehensive and accurate verb entries. First, a bilingual dictionary is used to link some Persian verbs to Princeton WordNet (PWN) synsets. A feature set is then proposed that captures the semantic behavior of compound verbs, which make up the majority of Persian verbs. This feature set is used in a supervised classification system that selects the proper links for inclusion in the wordnet. The training set is produced using a preexisting Persian wordnet, FarsNet, together with a similarity-based method. The result is the largest automatically developed Persian wordnet, with more than 27,000 words, 28,000 PWN synsets, and 67,000 word-sense pairs; it substantially outperforms the previous Persian wordnet, which has about 16,000 words, 22,000 PWN synsets, and 38,000 word-sense pairs.
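The first step described in the abstract, linking Persian verbs to synsets through a bilingual dictionary, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the dictionary entries are transliterated toy examples, the synset identifiers are invented stand-ins rather than real PWN entries, and the "support" count is a hypothetical feature, not the feature set proposed in the paper.

```python
from collections import defaultdict

# Hypothetical toy bilingual dictionary (transliterated Persian -> English).
# "tasmim gereftan" is a compound verb; "khordan" is a simple verb.
BILINGUAL_DICT = {
    "khordan": ["eat", "consume"],
    "tasmim gereftan": ["decide"],
}

# Hypothetical stand-in for an English synset inventory.
# These identifiers are invented and are NOT real Princeton WordNet synsets.
TOY_SYNSETS = {
    "toy.eat.1": ["eat"],
    "toy.consume.1": ["consume", "eat"],
    "toy.consume.2": ["consume", "use up"],
    "toy.decide.1": ["decide", "determine"],
}


def candidate_links(bilingual_dict, synsets):
    """Map each Persian verb to its candidate synsets.

    The value for each synset is the number of the verb's English
    translations that appear among that synset's lemmas.
    """
    # Invert the inventory: English lemma -> synsets containing it.
    lemma_to_synsets = defaultdict(set)
    for synset_id, lemmas in synsets.items():
        for lemma in lemmas:
            lemma_to_synsets[lemma].add(synset_id)

    links = {}
    for verb, translations in bilingual_dict.items():
        support = defaultdict(int)
        for translation in translations:
            for synset_id in lemma_to_synsets.get(translation, ()):
                support[synset_id] += 1
        links[verb] = dict(support)
    return links


links = candidate_links(BILINGUAL_DICT, TOY_SYNSETS)
for verb, candidates in links.items():
    print(verb, candidates)
```

Candidate links produced this way are noisy, since every sense of every translation becomes a candidate. That is why a later selection step is needed; in the paper this is the supervised classifier, and a count like the one above could at most serve as one input feature to such a classifier.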