Conference: International Conference on Artificial Intelligence

Improving Chinese Word Segmentation with Description Length Gain

Abstract

Supervised and unsupervised learning have seldom been combined to lend strength to each other in the field of Chinese word segmentation (CWS). This paper presents a novel approach to CWS that uses description length gain (DLG), an empirical goodness measure for unsupervised word discovery, to enhance the segmentation performance of conditional random field (CRF) learning. Specifically, we attempt to integrate the lexical information acquired from unsupervised DLG segmentation into the supervised CRF learning of character tagging for CWS. Our experimental results show that, by making good use of DLG information, CRF learning can be improved beyond its already state-of-the-art performance in the field.
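
The abstract does not spell out how DLG is computed. As a rough illustration only, the Python sketch below follows the commonly cited formulation, in which the description length of a corpus X is DL(X) = -sum over symbols x of c(x) * log2(c(x) / |X|), and the gain of a candidate string is the drop in DL after every occurrence of the candidate is replaced by one new symbol and a single copy of the candidate is appended. The function names, the delimiter placed before the appended copy, and the left-to-right non-overlapping replacement are assumptions made for this sketch, not details taken from the paper.

```python
import math
from collections import Counter


def description_length(tokens):
    """Empirical description length in bits: -sum c(x) * log2(c(x) / N)."""
    counts = Counter(tokens)
    total = len(tokens)
    return -sum(c * math.log2(c / total) for c in counts.values())


def replace_all(tokens, candidate, symbol):
    """Replace left-to-right, non-overlapping occurrences of candidate with symbol."""
    if not candidate:
        raise ValueError("candidate must be non-empty")
    out, i, k = [], 0, len(candidate)
    while i < len(tokens):
        if tokens[i:i + k] == candidate:
            out.append(symbol)
            i += k
        else:
            out.append(tokens[i])
            i += 1
    return out


def dlg(corpus, candidate):
    """Description length gain of treating `candidate` as one unit (hypothetical helper)."""
    tokens = list(corpus)
    cand = list(candidate)
    new_symbol = object()  # fresh symbol standing in for the candidate
    delimiter = object()   # assumed delimiter before the appended copy
    rewritten = replace_all(tokens, cand, new_symbol) + [delimiter] + cand
    return description_length(tokens) - description_length(rewritten)
```

A candidate with a positive gain compresses the corpus and is therefore a plausible word. How such scores are encoded as features for the CRF character tagger is not stated in the abstract, so any particular encoding (for example, bucketing the score per character position) would be a further assumption.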