Conference: Pacific Asia Conference on Language, Information and Computation

A Study of the Effectiveness of Suffixes for Chinese Word Segmentation

Abstract

We investigate whether suffix-related features can significantly improve the performance of character-based approaches to Chinese word segmentation (CWS). Because suffixes are highly productive in forming new words, and out-of-vocabulary (OOV) words are the main source of CWS errors, many researchers expect suffix information to improve performance further. Following this belief, we tried several suffix-related features in both generative and discriminative approaches. However, our experimental results show that adding suffix-related features to the widely adopted surface features hardly yields any significant improvement, contrary to the common assumption. Error analysis reveals that the main cause of this surprising finding is the conflict between the reliability and the coverage rate of suffix-related features.
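The tension between reliability and coverage can be illustrated with a small sketch. The abstract does not define either quantity, so the definitions below are assumptions made for illustration only: the reliability of a character is the fraction of its occurrences that end a multi-character word, and its coverage is the fraction of multi-character words that end in it. The function names `suffix_char_stats` and `coverage_at` and the toy corpus are hypothetical and are not taken from the paper.

```python
from collections import Counter


def suffix_char_stats(segmented_sentences):
    """Return {char: (reliability, coverage)} for every word-final character.

    Assumed definitions (illustrative, not the paper's):
      reliability = multi-character words ending in char / all occurrences of char
      coverage    = multi-character words ending in char / all multi-character words
    """
    char_total = Counter()  # every occurrence of a character, in any position
    end_total = Counter()   # occurrences as the last character of a multi-character word
    multi_char_words = 0

    for sentence in segmented_sentences:
        for word in sentence:
            char_total.update(word)  # counts each character of the word
            if len(word) >= 2:
                multi_char_words += 1
                end_total[word[-1]] += 1

    if multi_char_words == 0:
        return {}
    return {
        ch: (n / char_total[ch], n / multi_char_words)
        for ch, n in end_total.items()
    }


def coverage_at(stats, min_reliability):
    """Total coverage of the suffix characters whose reliability meets the threshold."""
    return sum(cov for rel, cov in stats.values() if rel >= min_reliability)


if __name__ == "__main__":
    # Hand-made toy corpus of already segmented sentences (hypothetical data).
    toy_corpus = [
        ["科学家", "和", "艺术家", "回", "家"],
        ["家里", "的", "化学", "现代化"],
        ["工业化", "和", "画家"],
    ]
    stats = suffix_char_stats(toy_corpus)
    for threshold in (0.5, 0.65, 0.9):
        print(threshold, round(coverage_at(stats, threshold), 3))
```

Under these assumed definitions, raising the reliability threshold keeps only the characters that almost always end a word, and the share of words those characters account for shrinks accordingly. This is one plausible reading of the conflict the abstract describes, not a reproduction of the authors' analysis.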
