Developing a corpus-based grammar model within a continuous commercial speech recognition package

Abstract

This paper is derived from experiments with a commercial 'off-the-shelf' continuous speech recognition system, applied to the apparently restricted domain of Air Traffic Control (ATC) for light aircraft. The system is required to transcribe key sub-phrases in a transmission from ATC to a particular aircraft, with the commercial speech recognition system providing the main recognition component. After a corpus of transmissions had been developed, it was realised that key information is often interspersed with unconstrained English. Initial attempts focused on using a wildcard mechanism for the non-key sub-phrases. The mechanism, however, proved to be valuable only in simplistic grammars because of its overgenerative nature. Working with the speech recognition system showed us that whilst it provides useful mechanisms, such as the wildcard mechanism, these tend to make over-simplistic assumptions about English grammar and dialogue structure.
