Annual Meeting of the Association for Computational Linguistics (ACL 2012)

Fast Syntactic Analysis for Statistical Language Modeling via Substructure Sharing and Uptraining



Abstract

Long-span features, such as syntax, can improve language models for tasks such as speech recognition and machine translation. However, these language models can be difficult to use in practice because of the time required to generate features for rescoring a large hypothesis set. In this work, we propose substructure sharing, which saves duplicate work in processing hypothesis sets with redundant hypothesis structures. We apply substructure sharing to a dependency parser and part-of-speech tagger to obtain significant speedups, and further improve the accuracy of these tools through up-training. When using these improved tools in a language model for speech recognition, we obtain significant speed improvements with both N-best and hill-climbing rescoring, and show that up-training leads to WER reduction.
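The core idea of substructure sharing described in the abstract can be sketched, very loosely, as memoizing the analysis of word spans that recur across hypotheses in an N-best list, so that shared material is processed once rather than per hypothesis. The fixed-size chunking scheme, the toy `tag_word` heuristic, and all function names below are illustrative assumptions, not the authors' implementation:

```python
from functools import lru_cache

def tag_word(word):
    # Toy stand-in for a real POS tagger decision; the expensive work in
    # practice is running a tagger or dependency parser per hypothesis.
    return "NOUN" if word[0].isupper() else "OTHER"

@lru_cache(maxsize=None)
def tag_span(span):
    # Substructure sharing (illustrative): N-best hypotheses share long
    # common word spans, so cache each span's analysis and reuse it
    # instead of re-analyzing every hypothesis from scratch.
    return tuple(tag_word(w) for w in span)

def tag_hypothesis(words, chunk=2):
    # Split a hypothesis into fixed-size spans and tag each span,
    # hitting the cache whenever a span was already seen.
    tags = []
    for i in range(0, len(words), chunk):
        tags.extend(tag_span(tuple(words[i:i + chunk])))
    return tags
```

On two hypotheses differing only in the final word, e.g. `["the", "Cat", "sat"]` and `["the", "Cat", "sits"]`, the shared leading span is analyzed once and served from the cache the second time, which is the source of the speedups the abstract reports at the scale of large hypothesis sets.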

