Journal: Natural Language Engineering

Improving part-of-speech tagging using lexicalized HMMs


Abstract

We introduce a simple method for building Lexicalized Hidden Markov Models (L-HMMs) to improve the precision of part-of-speech tagging. The technique enriches the contextual language model by taking into account a set of selected words obtained empirically. We evaluated it with different lexicalization criteria on the Penn Treebank corpus using the TnT tagger. Lexicalization reduced the tagging error by about 6% on unseen test data without reducing the efficiency of the system. We also studied how linguistic resources, such as dictionaries and morphological analyzers, improve tagging performance. Furthermore, an exhaustive experimental comparison shows that Lexicalized HMMs yield results that are better than or similar to those of other state-of-the-art part-of-speech tagging approaches. Finally, we applied Lexicalized HMMs to the Spanish corpus LexEsp.
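The abstract does not spell out how the contextual language model is enriched. One plausible reading, sketched below purely as an assumption, is that the training corpus is relabeled so that each selected word carries a word-specific tag, after which an unmodified HMM tagger can be trained on the relabeled data. The function names (`lexicalize`, `delexicalize`, `tag_bigram_counts`), the `~` separator, and the toy corpus are all hypothetical and are not taken from the paper or from TnT.

```python
from collections import Counter


def lexicalize(tagged_sentences, selected_words):
    """Relabel the tag of every selected word as 'TAG~word'.

    Assumption: lexicalization is done by specializing tags in the
    training data; the '~' separator is an arbitrary choice.
    """
    selected = {w.lower() for w in selected_words}
    return [
        [
            (word, f"{tag}~{word.lower()}" if word.lower() in selected else tag)
            for word, tag in sentence
        ]
        for sentence in tagged_sentences
    ]


def delexicalize(tags):
    """Map specialized tags back to the original tag set for evaluation."""
    return [tag.split("~", 1)[0] for tag in tags]


def tag_bigram_counts(tagged_sentences):
    """Count tag bigrams, the raw material of a bigram contextual model."""
    counts = Counter()
    for sentence in tagged_sentences:
        tags = ["<s>"] + [tag for _, tag in sentence] + ["</s>"]
        counts.update(zip(tags, tags[1:]))
    return counts


# Toy corpus with Penn Treebank style tags (illustrative only).
corpus = [
    [("I", "PRP"), ("know", "VBP"), ("that", "IN"), ("he", "PRP"), ("left", "VBD")],
    [("that", "DT"), ("dog", "NN"), ("barks", "VBZ")],
]

lexicalized = lexicalize(corpus, {"that"})
counts = tag_bigram_counts(lexicalized)
```

Under this assumption the tagger itself is unchanged: only the tag inventory grows, so transitions such as `VBP` to `IN~that` are estimated separately from transitions into a generic `IN`. Predicted tags are passed through `delexicalize` before scoring against the original tag set.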
