Conference: International Conference on Asian Language Processing

A Hybrid Deep Learning Architecture for Sentence Unit Detection

Abstract

Automatic speech recognition systems currently deliver an unpunctuated sequence of words, which is hard for humans to read and degrades the performance of downstream natural language processing tasks. In this paper, we propose a hybrid approach to Sentence Unit Detection that focuses on adding the full stop [.] to unstructured text. Our model combines the strengths of two dominant deep learning architectures: (i) the ability of a bidirectional Long Short-Term Memory network to learn long-range dependencies in both directions; (ii) the ability of Convolutional Neural Networks to capture local context. We also empirically study the training objective of our networks using an extra loss, and further investigate the impact of each model component on the overall result. Experiments conducted on two large-scale datasets demonstrate that the proposed architecture outperforms previous separate methods by a substantial margin of 1.82-1.91% in F1. Availability: the source code and model are available at https://github.com/catcd/LSTM-CNN-SUD.
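The task framing in the abstract can be illustrated with a small sketch. It assumes Sentence Unit Detection is treated as per-word binary labeling (label 1 when a full stop follows the word) and that F1 is computed on the boundary class; the abstract does not spell out either detail. The function names `to_labeled_tokens`, `boundary_f1`, and `restore_full_stops` are hypothetical and are not taken from the authors' repository. The BiLSTM-CNN model itself is not reproduced here, only the data framing and the metric.

```python
from typing import List, Tuple


def to_labeled_tokens(text: str) -> List[Tuple[str, int]]:
    """Turn punctuated text into (word, label) pairs.

    Label 1 means a full stop follows the word, label 0 means it does not.
    Words are lowercased to mimic unpunctuated ASR output (an assumption).
    """
    pairs: List[Tuple[str, int]] = []
    for token in text.split():
        is_end = token.endswith(".")
        word = token.rstrip(".").lower()
        if word:
            pairs.append((word, 1 if is_end else 0))
        elif is_end and pairs:
            # A stand-alone "." closes the previous word's sentence.
            prev_word, _ = pairs[-1]
            pairs[-1] = (prev_word, 1)
    return pairs


def boundary_f1(gold: List[int], pred: List[int]) -> float:
    """F1 score on the boundary class (label 1)."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def restore_full_stops(words: List[str], labels: List[int]) -> str:
    """Re-insert full stops after every word labeled 1."""
    if len(words) != len(labels):
        raise ValueError("words and labels must have the same length")
    return " ".join(w + "." if y == 1 else w for w, y in zip(words, labels))


if __name__ == "__main__":
    pairs = to_labeled_tokens("It works. We tested it.")
    words = [w for w, _ in pairs]
    gold = [y for _, y in pairs]
    pred = [0, 1, 0, 1, 1]  # a made-up prediction with one false boundary
    print(restore_full_stops(words, pred))
    print(round(boundary_f1(gold, pred), 3))
```

In this framing, any sequence model that emits one label per word, such as the hybrid architecture the abstract describes, could be scored with `boundary_f1` and its output rendered with `restore_full_stops`.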
