Automatic punctuation generation for speech


Abstract

Automatic generation of punctuation is an essential feature for many speech-to-text transcription tasks. This paper describes a Maximum A-Posteriori (MAP) approach for inserting punctuation marks into raw word sequences obtained from Automatic Speech Recognition (ASR). The system consists of an “acoustic model” (AM) for prosodic features (in practice, pause duration) and a “language model” (LM) for text-only features. The LM combines three components: an MLP-based trigger-word model, a forward trigram punctuation predictor, and a backward trigram punctuation predictor. Separating the acoustic and language models allows them to be learned on different corpora; in particular, the LM can be trained on large amounts of text for which no acoustic information is available. We find that the trigger-word LM is very useful, and that further improvement can be achieved by combining prosodic and lexical information. We achieve F-measures of 81.0% and 56.5% for voicemails and podcasts, respectively, on reference transcripts, and 69.6% for voicemails on ASR transcripts.
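
The abstract gives only the high-level decomposition into an acoustic model over pause duration and a text-only language model, so the following is a minimal sketch of how such a MAP decision could be combined per word boundary, not the authors' implementation. The punctuation inventory, the Gaussian pause model and its parameters, the weight `lm_weight`, and the stand-in `toy_lm_log_prob` are all assumptions made for illustration; the paper's actual LM combines an MLP-based trigger-word model with forward and backward trigram predictors, which is not reproduced here.

```python
import math

# Hypothetical punctuation inventory ("" means no mark); the abstract does not list the marks.
PUNCT = ["", ",", "."]

# Toy "acoustic model" parameters: (mean, std) of pause duration in seconds per mark.
# These numbers are invented for illustration only.
PAUSE_PARAMS = {"": (0.05, 0.05), ",": (0.25, 0.15), ".": (0.60, 0.30)}


def log_gauss(x, mean, std):
    """Log density of a normal distribution."""
    return -0.5 * math.log(2 * math.pi * std * std) - (x - mean) ** 2 / (2 * std * std)


def am_log_likelihood(pause, punct):
    """log p(pause duration | punctuation mark) under the toy Gaussian model."""
    mean, std = PAUSE_PARAMS[punct]
    return log_gauss(pause, mean, std)


def toy_lm_log_prob(words, i, punct):
    """Stand-in for the text-only model: favors a period after the last word."""
    if i == len(words) - 1:
        probs = {"": 0.1, ",": 0.1, ".": 0.8}
    else:
        probs = {"": 0.8, ",": 0.15, ".": 0.05}
    return math.log(probs[punct])


def map_punctuate(words, pauses, lm_log_prob, lm_weight=1.0):
    """After each word, pick the mark maximizing
    log p(pause | mark) + lm_weight * log P(mark | text context)."""
    out = []
    for i, word in enumerate(words):
        best = max(
            PUNCT,
            key=lambda p: am_log_likelihood(pauses[i], p)
            + lm_weight * lm_log_prob(words, i, p),
        )
        out.append(word + best)
    return " ".join(out)


words = ["hi", "john", "call", "me", "back"]
pauses = [0.02, 0.30, 0.03, 0.02, 0.70]  # pause after each word, in seconds
print(map_punctuate(words, pauses, toy_lm_log_prob))  # hi john, call me back.
```

Because the two scores enter the decision as separate terms, the text-only model can be swapped or retrained on text without pause annotations while the pause model stays unchanged, which mirrors the separation the abstract describes.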


Bibliographic details

Source
Automatic Speech Recognition & Understanding, 2009. ASRU 2009 | 2009 | pp. 586-589 | 4 pages
Conference location: Merano (IT)
Authors
Shen Wenzhu; Yu Roger Peng; Seide Frank; Wu Ji
Author affiliation

Microsoft Research Asia, 5F Beijing Sigma Center, 49 Zhichun Rd., 100080, China

Original format: PDF

Similar literature


1. Bilingual Experiments on Automatic Recovery of Capitalization and Punctuation of Automatic Speech Transcripts [J]. Batista F., Moniz H., Trancoso I. Audio, Speech, and Language Processing, IEEE Transactions on. 2012, No. 2
2. A combined punctuation generation and speech recognition system and its performance enhancement using prosody [J]. Ji-Hwan Kim, Philip C. Woodland. Speech Communication. 2003, No. 4
3. Insertion methods of punctuation marks for speech [J]. Yosuke Toyama, Morio Nagata. 電子情報通信学会技術研究報告. 言語理解とコミュニケーション. Natural Language Understanding and Models of Communication. 2000, No. 100
4. Automatic Punctuation Generation For Speech [C]. Wenzhu Shen, Roger Peng Yu, Frank Seide. IEEE Workshop on Automatic Speech Recognition & Understanding. 2009
5. Automatic Biological Protocol and Python Code Generation Using Human Speech [D]. Kapner, Kevin. 2019
6. Brain-inspired speech segmentation for automatic speech recognition using the speech envelope as a temporal reference [O]. Byeongwook Lee, Kwang-Hyun Cho
7. Punctuation Generation Inspired Linguistic Features for Mandarin Prosody Generation [O]. Chen-Yu Chiang, Yu-Ping Hung, Han-Yun Yeh. 2018
