Transition-Based Neural Word Segmentation

Abstract

Character-based and word-based methods are the two main types of statistical models for Chinese word segmentation. The former exploit sequence labeling models over characters; the latter typically exploit a transition-based model, with the advantage that word-level features can be easily utilized. Neural models have been exploited for character-based Chinese word segmentation, giving high accuracies by making use of external character embeddings while requiring less feature engineering. In this paper, we study a neural model for word-based Chinese word segmentation, replacing the manually designed discrete features with neural features in a word-based segmentation framework. Experimental results demonstrate that word features lead to performance comparable to the best systems in the literature, and a further combination of discrete and neural features gives top accuracies.
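
The abstract does not spell out the transition system, so the following is only a minimal sketch of one common formulation of word-based, transition-based segmentation: each character is consumed by one of two actions, either starting a new word or extending the current one. The action names `SEP` and `APP` and the helpers `oracle_actions` and `apply_actions` are hypothetical, chosen for illustration. No scoring is shown; in a neural model of the kind the abstract describes, a learned scorer would choose among these actions using features of the partially built words.

```python
from typing import List

# Hypothetical action names for this sketch (not taken from the abstract).
SEP = "SEP"  # start a new word with the next character
APP = "APP"  # append the next character to the current word


def oracle_actions(words: List[str]) -> List[str]:
    """Derive the action sequence that reproduces a given segmentation."""
    actions: List[str] = []
    for word in words:
        if not word:
            continue  # ignore empty strings; they consume no characters
        actions.append(SEP)                       # first character opens a word
        actions.extend([APP] * (len(word) - 1))   # the rest extend that word
    return actions


def apply_actions(chars: str, actions: List[str]) -> List[str]:
    """Replay one action per character to build the word sequence."""
    if len(chars) != len(actions):
        raise ValueError("need exactly one action per character")
    words: List[str] = []
    for ch, act in zip(chars, actions):
        if act not in (SEP, APP):
            raise ValueError(f"unknown action: {act}")
        if act == SEP or not words:
            words.append(ch)      # open a new word (forced for the first char)
        else:
            words[-1] += ch       # extend the most recent word
    return words


gold = ["中国", "外企", "业务"]
actions = oracle_actions(gold)
segmented = apply_actions("".join(gold), actions)
```

Because the state always exposes the words built so far, word-level features (for example, the last complete word or its length) are directly available when choosing the next action, which is the advantage the abstract attributes to word-based models.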
