Language models are used extensively in state-of-the-art speech recognition systems to help determine the probability of a hypothesized word sequence. These probabilities, along with the acoustic model scores, allow the system to constrain the search space during recognition to only those word sequences that have a reasonable chance of being correct. In order to determine these probabilities, knowledge of the entire problem space is necessary. However, in speech recognition, this is an unreasonable if not impossible task, especially when one is using the SWITCHBOARD corpus (a large corpus consisting of over 240 hours of recorded telephone conversations totaling almost 3 million words of text). Many statistical and rule-based approaches have been applied to this problem in order to arrive at a language model that produces the minimal word error rate (WER) of the recognizer. One technique includes part-of-speech (POS) information in the language model. This paper discusses the task of tagging the SWITCHBOARD corpus with POS information in the usual manner, and the problems encountered when trying to conform conversational speech to these tags.
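To make the POS-based language-modeling idea concrete, the following is a minimal sketch (not the paper's actual model) of a class-based bigram, where the probability of a word given its history is factored through POS tags: P(w_i, t_i | t_{i-1}) = P(w_i | t_i) · P(t_i | t_{i-1}). The tiny tagged corpus and tag set here are hypothetical stand-ins for SWITCHBOARD-style tagged transcripts, and no smoothing is applied.

```python
from collections import defaultdict

# Toy tagged corpus: lists of (word, POS) pairs. Hypothetical data,
# standing in for SWITCHBOARD-style tagged conversational transcripts.
corpus = [
    [("i", "PRP"), ("like", "VBP"), ("it", "PRP")],
    [("i", "PRP"), ("see", "VBP"), ("it", "PRP")],
]

tag_bigram = defaultdict(int)      # counts of (t_{i-1}, t_i)
tag_unigram = defaultdict(int)     # counts of t_i
word_given_tag = defaultdict(int)  # counts of (t_i, w_i)

for sentence in corpus:
    prev_tag = "<s>"  # sentence-start pseudo-tag
    for word, tag in sentence:
        tag_bigram[(prev_tag, tag)] += 1
        tag_unigram[tag] += 1
        word_given_tag[(tag, word)] += 1
        prev_tag = tag
tag_unigram["<s>"] = len(corpus)  # one start symbol per sentence

def prob(word, tag, prev_tag):
    """P(w_i, t_i | t_{i-1}) = P(w_i | t_i) * P(t_i | t_{i-1}).

    Unsmoothed relative-frequency estimates; a real system would
    smooth and sum over candidate tags for each word.
    """
    p_word = word_given_tag[(tag, word)] / tag_unigram[tag]
    p_tag = tag_bigram[(prev_tag, tag)] / tag_unigram[prev_tag]
    return p_word * p_tag

# Example: probability of sentence-initial "i" tagged PRP.
print(prob("i", "PRP", "<s>"))
```

Factoring through tag classes like this shares statistics across words with the same POS, which is one motivation for tagging a corpus such as SWITCHBOARD in the first place.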