A Study of Punctuation Prediction Model for Domain-free Text

Abstract

Most automatic speech recognition (ASR) systems output unpunctuated text. The lack of punctuation causes problems for both human readers and off-the-shelf natural language processing algorithms, so automatically adding punctuation to recognized text is an important task. Human communication usually spans many domains, each with its own lexicon and writing style. For this reason, taking domain information into account is helpful for punctuation prediction. Previous studies mainly rely on acoustic and textual features, such as part-of-speech tags, word vectors [1], pause durations between words [2], pitch, and so on [3]. However, few studies take into account the individual properties of different domains. In this paper, we first test the effect of domain information on the punctuation prediction task, and then propose a punctuation prediction model based on a multi-task learning (MTL) approach with domain information.
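
The abstract does not describe the model architecture or the training objective. As a rough illustration of what a multi-task setup with domain information could look like, the sketch below assumes a per-token punctuation classifier and a sentence-level domain classifier whose losses are summed with a weight. The function names, the `domain_weight` parameter, and the label layout are all hypothetical and are not taken from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Negative log-likelihood of the target class index."""
    return -math.log(softmax(logits)[target])


def multitask_loss(punct_logits, punct_targets, domain_logits, domain_target,
                   domain_weight=0.5):
    """Joint loss for a hypothetical multi-task punctuation model.

    punct_logits:  one list of class scores per token (main task)
    punct_targets: one punctuation class index per token
    domain_logits: class scores for the sentence's domain (auxiliary task)
    domain_target: domain class index
    domain_weight: assumed weight of the auxiliary loss, not from the paper
    """
    # Main task: mean cross-entropy over tokens.
    punct_loss = sum(
        cross_entropy(logits, target)
        for logits, target in zip(punct_logits, punct_targets)
    ) / len(punct_targets)
    # Auxiliary task: one domain label per sentence.
    domain_loss = cross_entropy(domain_logits, domain_target)
    return punct_loss + domain_weight * domain_loss


# Two tokens, three punctuation classes, two domains, all scores uniform.
loss = multitask_loss([[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]], [0, 2],
                      [0.0, 0.0], 1)
```

In a setup like this, both classifiers would typically share an encoder, so gradients from the domain loss shape the representation used for punctuation prediction. Whether the paper uses this weighting scheme is not stated in the abstract.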
