
Is Local Window Essential for Neural Network Based Chinese Word Segmentation?



Abstract

Compared with conventional approaches, neural network based Chinese Word Segmentation (CWS) approaches can bypass burdensome feature engineering. All previous neural network based approaches rely on a local window in the character sequence labelling process. Such a window can hardly exploit context outside it and may retain irrelevant context inside it. Moreover, the size of the local window is a laboriously hand-tuned hyper-parameter that has a significant influence on model performance. We therefore ask whether the local window can be discarded in neural network based CWS. In this paper, we present a window-free Bi-directional Long Short-Term Memory (Bi-LSTM) neural network based Chinese word segmentation model. The model takes the whole sentence into consideration to generate a reasonable word sequence. The experiments show that the Bi-LSTM can learn sufficient context for CWS without the local window.
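
To make the contrast in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It compares the input a local-window labeller sees for each character with the whole-sentence context a bidirectional model can draw on. The function names, the `<pad>` symbol, the window size of 3, and the B/M/E/S tag scheme are all assumptions chosen for illustration; the abstract specifies none of them.

```python
PAD = "<pad>"  # hypothetical padding symbol for positions outside the sentence


def local_windows(sentence, size=3):
    """Return, for each character, the fixed-size window centred on it.

    `size` is the hand-tuned hyper-parameter the abstract refers to.
    """
    if size % 2 != 1:
        raise ValueError("window size must be odd")
    half = size // 2
    padded = [PAD] * half + list(sentence) + [PAD] * half
    return [padded[i:i + size] for i in range(len(sentence))]


def full_contexts(sentence):
    """Return (left context, character, right context) for each position.

    This is the information a forward pass and a backward pass over the
    whole sentence can, in principle, condition on at each character.
    """
    chars = list(sentence)
    return [(chars[:i], chars[i], chars[i + 1:]) for i in range(len(chars))]


def tags_to_words(sentence, tags):
    """Turn per-character tags into words (assumed B/M/E/S scheme:
    Begin, Middle, End of a multi-character word, or Single)."""
    words, current = [], ""
    for ch, tag in zip(sentence, tags):
        current += ch
        if tag in ("E", "S"):
            words.append(current)
            current = ""
    if current:  # tolerate a sequence that ends mid-word
        words.append(current)
    return words
```

For the sentence "我爱北京" with the assumed tags `SSBE`, `tags_to_words` yields the words 我, 爱, 北京. With a window of size 3 the labeller never sees 我 when tagging 京, whereas `full_contexts` exposes the entire sentence at every position.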
