Published in: 2011 IEEE International Conference on Computer Science and Automation Engineering

Using Hidden Markov Model to improve the accuracy of Punjabi POS tagger

Abstract

Part-of-speech (POS) tagging is the process of assigning the correct tag to each word of a sentence. The accuracy of downstream NLP tasks such as grammar checking, phrase chunking, and machine translation depends on the accuracy of the POS tagger. We attempted to improve the accuracy of an existing Punjabi POS tagger, which fails to resolve ambiguity in compound and complex sentences. A bigram Hidden Markov Model (HMM) was used to solve the part-of-speech tagging problem. An annotated corpus of 20,000 words was used for training, and the HMM parameters were estimated by maximum likelihood. The HMM approach was implemented using the Viterbi algorithm. We developed a module that takes the output of the existing POS tagger as input and assigns the correct tag to each word that has more than one candidate tag. The module was tested on a corpus of 26,479 words. Manual evaluation showed an accuracy of 90.11%.
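The approach described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `<s>` start marker, the `floor` probability for unseen events, and the toy English corpus are all assumptions made for illustration. The sketch estimates bigram transition and emission probabilities from counts (maximum likelihood) and runs Viterbi only over the candidate tags that an upstream tagger left for each word.

```python
import math
from collections import Counter


def train_bigram_hmm(tagged_sentences):
    """Collect the counts needed for maximum likelihood estimates."""
    trans = Counter()      # (previous_tag, tag) -> count
    emit = Counter()       # (tag, word) -> count
    tag_count = Counter()  # tag -> count, including the "<s>" start marker
    for sentence in tagged_sentences:
        prev = "<s>"
        tag_count[prev] += 1
        for word, tag in sentence:
            trans[(prev, tag)] += 1
            emit[(tag, word)] += 1
            tag_count[tag] += 1
            prev = tag
    return trans, emit, tag_count


def viterbi_disambiguate(words, candidates, trans, emit, tag_count, floor=1e-6):
    """Pick the most probable tag sequence, restricted to candidates[i] per word.

    `floor` is an assumed stand-in probability for unseen events; it is not
    part of the method described in the abstract.
    """
    if not words:
        return []

    def logp(count, total):
        if count > 0 and total > 0:
            return math.log(count / total)
        return math.log(floor)

    # best[i][tag] = (best log-probability of a path ending in tag, previous tag)
    best = [{}]
    for tag in candidates[0]:
        score = (logp(trans[("<s>", tag)], tag_count["<s>"])
                 + logp(emit[(tag, words[0])], tag_count[tag]))
        best[0][tag] = (score, None)

    for i in range(1, len(words)):
        best.append({})
        for tag in candidates[i]:
            emission = logp(emit[(tag, words[i])], tag_count[tag])
            best[i][tag] = max(
                (best[i - 1][prev][0]
                 + logp(trans[(prev, tag)], tag_count[prev])
                 + emission, prev)
                for prev in candidates[i - 1]
            )

    # Follow the back-pointers from the best final tag.
    tag = max(best[-1], key=lambda t: best[-1][t][0])
    path = [tag]
    for i in range(len(words) - 1, 0, -1):
        tag = best[i][tag][1]
        path.append(tag)
    return path[::-1]


# Hypothetical toy corpus (English, for readability); not data from the paper.
toy_corpus = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("bark", "NOUN"), ("falls", "VERB")],
    [("dogs", "NOUN"), ("bark", "VERB")],
]
trans, emit, tag_count = train_bigram_hmm(toy_corpus)

# "bark" is left ambiguous by the (hypothetical) upstream tagger.
words = ["the", "bark", "falls"]
candidates = [["DET"], ["NOUN", "VERB"], ["VERB"]]
result = viterbi_disambiguate(words, candidates, trans, emit, tag_count)
print(result)
```

Restricting the search to each word's candidate list mirrors the post-processing role described above: words that already carry a single tag keep it, and only ambiguous words are decided by the model.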
