Conference: Language and Technology Conference

Hierarchical Amharic Base Phrase Chunking Using HMM with Error Pruning

Abstract

Segmentation of a text into non-overlapping syntactic units (chunks) has become an essential component of many natural language processing applications. This paper presents an Amharic base phrase chunker that groups syntactically correlated words at different levels using a hidden Markov model (HMM). Rules are then applied to correct phrases that the HMM chunks incorrectly. The IOB2 chunk specification is used to identify phrase boundaries. To test the performance of the system, a corpus was collected from Amharic news outlets and books. The training and testing datasets were prepared using 10-fold cross-validation. Test results on the corpus showed an average accuracy of 85.31% before applying the error-correction rules and 93.75% after applying them.
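
As an illustration of the IOB2 specification mentioned in the abstract, the sketch below converts chunk spans into per-token tags: the first token of each chunk gets `B-<label>`, the remaining tokens of the chunk get `I-<label>`, and tokens outside any chunk get `O`. The function name `to_iob2`, the span format, and the English placeholder sentence are assumptions made for this example; they are not taken from the paper, and the paper's Amharic tag set may differ.

```python
def to_iob2(n_tokens, chunks):
    """Return IOB2 tags for a sentence of n_tokens tokens.

    chunks is a list of non-overlapping (start, end, label) spans,
    with end exclusive. This span format is an assumption for the sketch.
    """
    tags = ["O"] * n_tokens              # tokens outside any chunk
    for start, end, label in chunks:
        tags[start] = "B-" + label       # chunk-initial token
        for i in range(start + 1, end):
            tags[i] = "I-" + label       # chunk-internal tokens
    return tags


# Hypothetical English placeholder sentence (not from the paper's corpus).
tokens = ["the", "old", "man", "went", "to", "the", "market"]
chunks = [(0, 3, "NP"), (3, 4, "VP"), (4, 7, "PP")]
tags = to_iob2(len(tokens), chunks)

for token, tag in zip(tokens, tags):
    print(token, tag)
```

Because every chunk starts with a `B-` tag in IOB2, two adjacent chunks of the same type remain distinguishable, which is what makes the scheme usable for marking phrase boundaries.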