Global Wordnet Conference

Practical Approach on Implementation of WordNets for South African Languages




This paper proposes the implementation of WordNets for five South African languages, namely Sepedi, Setswana, Tshivenda, isiZulu and isiXhosa, to be added to the Open Multilingual WordNet (OMW) in the Natural Language Toolkit (NLTK). The African WordNets are converted from Princeton WordNet (PWN) 2.0 to 3.0 so that their synsets match those in PWN 3.0. After conversion, there were 7157, 11972, 1288, 6380, and 9460 lemmas for Sepedi, Setswana, Tshivenda, isiZulu and isiXhosa respectively. Setswana, isiXhosa and Sepedi each contain more lemmas than 8 of the languages in OMW, and isiZulu contains more lemmas than 7 of them. A library has been published for the continuous development of African WordNets in OMW using NLTK.
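The conversion step described in the abstract can be pictured with a minimal sketch. It is not the authors' published library. It assumes a lookup table from PWN 2.0 synset IDs to PWN 3.0 synset IDs is available, and that output is written in the tab-separated layout commonly used for OMW data files (synset ID, `lang:lemma`, lemma). All synset IDs, lemmas, and function names below are hypothetical placeholders, and `nso` is assumed as the language code for Sepedi. Counting distinct lemma strings is also an assumption about how the lemma totals are computed.

```python
from collections import defaultdict

# Hypothetical PWN 2.0 -> PWN 3.0 synset-ID mapping.
# These IDs are placeholders, not real WordNet offsets.
PWN20_TO_30 = {
    "00000001-n": "00000011-n",
    "00000002-v": "00000022-v",
}

# Hypothetical source entries keyed by PWN 2.0 synset ID: (synset_id, lemma).
ENTRIES_20 = [
    ("00000001-n", "lemma_a"),
    ("00000001-n", "lemma_b"),
    ("00000002-v", "lemma_a"),
    ("00000003-n", "lemma_c"),  # has no 3.0 counterpart in the mapping
]


def convert(entries, mapping):
    """Re-key lemmas from 2.0 synset IDs to 3.0 IDs; collect unmapped entries."""
    converted = defaultdict(list)
    unmapped = []
    for synset_id, lemma in entries:
        target = mapping.get(synset_id)
        if target is None:
            unmapped.append((synset_id, lemma))
        else:
            converted[target].append(lemma)
    return dict(converted), unmapped


def to_tab_lines(converted, lang):
    """Render entries as tab-separated lines: synset ID, 'lang:lemma', lemma."""
    return [
        f"{synset_id}\t{lang}:lemma\t{lemma}"
        for synset_id in sorted(converted)
        for lemma in converted[synset_id]
    ]


def count_lemmas(converted):
    """Count distinct lemma strings across all converted synsets."""
    return len({lemma for lemmas in converted.values() for lemma in lemmas})


converted, unmapped = convert(ENTRIES_20, PWN20_TO_30)
tab_lines = to_tab_lines(converted, "nso")  # "nso" assumed for Sepedi
print("\n".join(tab_lines))
print("distinct lemmas:", count_lemmas(converted))
print("unmapped entries:", unmapped)
```

Keeping the unmapped entries in a separate list, rather than dropping them silently, makes it easy to see how many lemmas are lost when a 2.0 synset has no 3.0 counterpart.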


