Improved Models for Automatic Punctuation Prediction for Spoken and Written Text

Abstract

This paper presents improved models for the automatic prediction of punctuation marks in written or spoken text. Various kinds of textual features are combined using Conditional Random Fields. These features include language model scores, token n-grams, sentence length, and syntactic information extracted from parse trees. The resulting models are evaluated on several different tasks, ranging from formal newspaper text to informal, dictated messages and documents, and from written text to spoken text. The newly developed models outperform a hidden-event language model by up to 26% relative in F-score. Evaluation of punctuation prediction on erroneous ASR output as well as evaluation against multiple references is not straightforward. We propose modifications of existing evaluation methods to handle these cases.

Bibliographic details

Source: Conference of the International Speech Communication Association, 2013, 5 pages
Authors: Nicola Ueffing; Maximilian Bisani; Paul Vozila
Original format: PDF
Chinese Library Classification: TN912.3-532
Keywords: punctuation; prediction; models

Similar literature

1. Automatic prediction of intelligibility of English words spoken with Japanese accents - Comparative study of features and models used for prediction [J]. Teeraphon PONGKITTIPHAN, Nobuaki MINEMATSU, Takehiko MAKINO, 電子情報通信学会技術研究報告. 音声. Speech. 2014, No. 411
2. MA20.09 Improved Lung Cancer and Mortality Prediction Accuracy Using Survival Models Based on Semi-Automatic CT Image Measurements [J]. A. Schreuder, C. Jacobs, N. Lessmann, Journal of thoracic oncology: official publication of the International Association for the Study of Lung Cancer. 2018, No. 10
3. Automatic prediction of flexible regions improves the accuracy of protein-protein docking models [J]. Luo X., Lü Q., Wu H., Journal of molecular modeling. 2012, No. 5
4. Improved Models for Automatic Punctuation Prediction for Spoken and Written Text [C]. Nicola Ueffing, Maximilian Bisani, Paul Vozila, Conference of the International Speech Communication Association. 2013
5. Improving high quality concatenative text-to-speech synthesis using the circular linear prediction model [D]. Shukla, Sunil Ravindra. 2007
6. Does Emotional Disclosure About Stress Improve Health in Rheumatoid Arthritis? Randomized Controlled Trials of Written and Spoken Disclosure [O]. Mark A. Lumley, James C.C. Leisen, R. Ty Partridge
7. Automatic Sentence Segmentation and Punctuation Prediction for Spoken Language Translation [O]. Matusov Evgeny, Mauser Arne, Ney Hermann. 2006
8. Using Written Language Training Data for Spoken Language Modeling [R]. Schwartz, R., Nguyen, L., Kubala, F. 1994
