Part-of-Speech Tagger Based on Maximum Entropy Model

Abstract

Maximum entropy (ME) conditional models are not bound by the independence assumptions that generative Hidden Markov Models require. An ME-based part-of-speech (POS) tagger can therefore draw on arbitrary, non-independent features that benefit POS tagging, without having to model the distribution of those dependencies. Because ME models can flexibly combine a wide variety of features, the data sparseness problem in the training data is effectively addressed. Experiments show that, relative to a Hidden-Markov-Model-based baseline, the POS tagging error rate is reduced by 54.25% in the closed test and by 40.56% in the open test, yielding accuracies of 98.01% in the closed test and 95.56% in the open test.
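
The idea of conditioning on overlapping, non-independent features can be illustrated with a minimal sketch. It is not the tagger evaluated in the paper: the feature set (`features`), the toy corpus (`TOY`), the learning rate, the stochastic gradient training loop, and the greedy left-to-right decoding are all assumptions chosen for brevity. The model computes p(tag | context) as a normalized exponential of the summed weights of the active (feature, tag) pairs, which is the standard ME conditional form.

```python
import math
from collections import defaultdict


def features(words, i, prev_tag):
    """Overlapping features for position i (a hypothetical feature set)."""
    w = words[i]
    return [
        f"word={w.lower()}",
        f"suffix2={w[-2:].lower()}",          # overlaps with the word feature
        f"prev_tag={prev_tag}",               # depends on the previous decision
        f"is_capitalized={w[0].isupper()}",
    ]


def tag_distribution(weights, tags, feats):
    """p(tag | context): softmax over summed weights of active (feature, tag) pairs."""
    scores = {t: sum(weights[(f, t)] for f in feats) for t in tags}
    m = max(scores.values())                  # subtract the max for numerical stability
    exp_scores = {t: math.exp(s - m) for t, s in scores.items()}
    z = sum(exp_scores.values())
    return {t: v / z for t, v in exp_scores.items()}


def train(sentences, epochs=50, lr=0.5):
    """Stochastic gradient ascent on the conditional log-likelihood."""
    tags = sorted({t for sent in sentences for _, t in sent})
    weights = defaultdict(float)
    for _ in range(epochs):
        for sent in sentences:
            words = [w for w, _ in sent]
            prev = "<s>"                      # sentence-start marker
            for i, (_, gold) in enumerate(sent):
                feats = features(words, i, prev)
                probs = tag_distribution(weights, tags, feats)
                for t in tags:
                    # gradient = observed feature count - expected feature count
                    grad = (1.0 if t == gold else 0.0) - probs[t]
                    for f in feats:
                        weights[(f, t)] += lr * grad
                prev = gold                   # condition on the gold previous tag
    return weights, tags


def tag(weights, tags, words):
    """Greedy decoding: pick the most probable tag at each position."""
    prev, out = "<s>", []
    for i in range(len(words)):
        probs = tag_distribution(weights, tags, features(words, i, prev))
        prev = max(probs, key=probs.get)
        out.append(prev)
    return out


# Toy corpus, invented purely for illustration.
TOY = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("a", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("runs", "VERB")],
]

weights, tags = train(TOY)
print(tag(weights, tags, ["the", "dog", "sleeps"]))
```

Note that the word, suffix, and previous-tag features are clearly correlated, yet the model needs no assumption about how they are jointly distributed. A generative HMM, by contrast, would have to model the emission of each observation explicitly.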
