Journal: 《计算机应用》 (Journal of Computer Applications)

Chinese Word Segmentation and Punctuation Prediction Based on an Improved Multilayer BLSTM

             

Abstract

Mainstream approaches to sequence labeling are currently based on Recurrent Neural Networks (RNNs). Building on a study of RNNs and the sequence labeling problem, an improved multilayer Bidirectional Long Short-Term Memory (BLSTM) network is proposed. Each BLSTM layer in the network performs an information fusion step, so its output contains more contextual information. In addition, a joint method based on sequence labeling is proposed that performs Chinese word segmentation and punctuation prediction in parallel. Experiments on public datasets show that the improved multilayer BLSTM model raises classification accuracy for both Chinese word segmentation and punctuation prediction. When both tasks are required, the joint method greatly reduces system complexity. The new model, and the joint method built on it, can also be applied to other sequence labeling problems.
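The abstract does not say how the two tasks are merged into one labeling problem. As a minimal sketch of one possible encoding, assuming a B/M/E/S segmentation tag joined to a tag naming the punctuation mark that follows each character, the labels for a single tagger could be built as below. The tag names, the `PUNCT` table, and the functions `joint_labels` and `decode` are hypothetical and are not taken from the paper.

```python
# Hypothetical joint tag scheme; the paper's actual label set is not given
# in the abstract. Each character receives "<seg>-<punct>", where <seg> is
# B/M/E/S and <punct> names the mark that follows it ("O" if none).

PUNCT = {",": "COMMA", "。": "PERIOD", "?": "QUESTION"}


def joint_labels(tokens):
    """Convert a segmented token list (words and punctuation marks) into
    (character, joint_tag) pairs, dropping the punctuation characters."""
    chars, tags = [], []
    for tok in tokens:
        if tok in PUNCT:
            if tags:
                # Attach the mark to the character that precedes it.
                seg = tags[-1].split("-")[0]
                tags[-1] = f"{seg}-{PUNCT[tok]}"
            continue
        if len(tok) == 1:
            seg_tags = ["S"]
        else:
            seg_tags = ["B"] + ["M"] * (len(tok) - 2) + ["E"]
        for ch, seg in zip(tok, seg_tags):
            chars.append(ch)
            tags.append(f"{seg}-O")
    return list(zip(chars, tags))


def decode(pairs):
    """Invert joint_labels: rebuild the token list from (char, tag) pairs."""
    inverse = {name: mark for mark, name in PUNCT.items()}
    tokens, word = [], ""
    for ch, tag in pairs:
        seg, punct = tag.split("-")
        word += ch
        if seg in ("E", "S"):
            # A word ends on E (multi-character) or S (single character).
            tokens.append(word)
            word = ""
        if punct != "O":
            tokens.append(inverse[punct])
    return tokens
```

With an encoding of this kind, one sequence tagger predicts a single tag per character and both the word boundaries and the punctuation are recovered by `decode`, which is one way a joint method could avoid running two separate systems.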
