IEEE International Conference on Signal, Information and Data Processing

Using bidirectional LSTM with BERT for Chinese punctuation prediction


Abstract

Punctuation prediction is an important step in the post-processing of ASR systems. Text lacking punctuation is usually difficult to read and understand. In this paper, we propose a method for Chinese punctuation prediction that combines a Bidirectional Long Short-Term Memory (BLSTM) network with Bidirectional Encoder Representations from Transformers (BERT): BERT serves as the text-encoding layer, learning contextualized word representations that improve the performance of the BLSTM network. Compared with previous punctuation prediction methods based on Recurrent Neural Networks (RNNs), our method improves punctuation prediction through its stronger ability to capture semantics and long-distance dependencies in unsegmented Chinese text. Our experimental results on Chinese news datasets show that our BERT-BLSTM based method outperforms the baseline by up to 31.07% absolute in overall micro-F1.
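The architecture the abstract describes can be sketched as a per-token sequence tagger: an encoder produces contextualized representations, a BLSTM reads them in both directions, and a linear layer classifies each token into a punctuation label. Below is a minimal PyTorch sketch; the small randomly initialized embedding is only a stand-in for a pretrained BERT encoder, and the label set and dimensions are illustrative assumptions, not the paper's actual configuration.

```python
import torch
import torch.nn as nn

class PunctuationTagger(nn.Module):
    """Sketch of an encoder -> BLSTM -> classifier punctuation tagger.

    An nn.Embedding stands in for BERT here; in practice BERT's per-token
    hidden states would be fed to the BLSTM in the same way.
    """
    def __init__(self, vocab_size=100, hidden=32, num_labels=4):
        super().__init__()
        # Stand-in for BERT's contextualized token representations
        self.encoder = nn.Embedding(vocab_size, hidden)
        # Bidirectional LSTM over the encoded sequence
        self.blstm = nn.LSTM(hidden, hidden,
                             bidirectional=True, batch_first=True)
        # Per-token classifier, e.g. {none, comma, period, question mark}
        self.classifier = nn.Linear(2 * hidden, num_labels)

    def forward(self, token_ids):
        x = self.encoder(token_ids)      # (batch, seq, hidden)
        out, _ = self.blstm(x)           # (batch, seq, 2 * hidden)
        return self.classifier(out)      # (batch, seq, num_labels)

model = PunctuationTagger()
logits = model(torch.randint(0, 100, (2, 10)))
print(tuple(logits.shape))  # (2, 10, 4): one label distribution per token
```

Training such a tagger would typically use a per-token cross-entropy loss over the punctuation labels; the bidirectional LSTM is what lets each token's prediction depend on both left and right context.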
