IEEE International Conference on Computer and Communications
Word-level BERT-CNN-RNN Model for Chinese Punctuation Restoration



Abstract

Punctuation restoration for speech recognition output has a wide range of applications. Despite the widespread success of neural network methods at punctuation restoration for English, there have been only limited attempts at Chinese punctuation restoration. Because Chinese and English differ in grammar and in their basic semantic units, existing methods designed for English are not suitable for Chinese punctuation restoration. To tackle this problem, we propose a hybrid model combining Bidirectional Encoder Representations from Transformers (BERT) with a Convolutional Neural Network (CNN) and a Recurrent Neural Network (RNN). The model employs a flexible structure and a special CNN design that extracts word-level features for Chinese. We compared the hybrid model against five widely used punctuation restoration models on a public dataset. Experimental results demonstrate that our hybrid model is simple and efficient: it outperforms the other models and achieves an accuracy of 69.1%.
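The task the abstract describes is commonly framed as per-token sequence labeling: the model predicts, for each input character or word, which punctuation mark (if any) follows it. A minimal sketch of that framing is below; the label set and example are illustrative assumptions, not details taken from the paper.

```python
# Hedged sketch of punctuation restoration as sequence labeling.
# Label set is an assumption: "O" means no punctuation after the token;
# other labels name the mark to insert after it.
PUNCT_LABELS = ["O", "，", "。", "？"]

def apply_labels(tokens, labels):
    """Re-insert punctuation by appending each non-O label after its token."""
    out = []
    for tok, lab in zip(tokens, labels):
        out.append(tok)
        if lab != "O":
            out.append(lab)
    return "".join(out)

# A model such as the paper's BERT-CNN-RNN hybrid would predict `labels`
# from the unpunctuated token sequence; here they are hard-coded.
tokens = ["你", "好", "吗"]
labels = ["O", "O", "？"]
print(apply_labels(tokens, labels))  # 你好吗？
```

Under this framing, restoration accuracy (as reported, 69.1%) is the fraction of tokens whose predicted label matches the reference.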


