A phoneme based speech recognition system for high stress moderate noise environments

Abstract

The main goal of this project was to develop for NASA a high-performance, phoneme-based speech recognition development testbed and, in doing so, to improve the core technology of SSI (Speech Systems Incorporated). The main areas of research were: (1) improvements for NASA-specific requirements, and (2) general performance improvements to SSI's speech recognizer. NASA's requirements were primarily performance improvements in stressful and noisy speaking environments. These requirements matter for NASA's potential applications, but they are also important to applications outside NASA. The general performance research included studies to improve the speed, accuracy, and speaker coverage and/or independence of SSI's speech recognizer.

Several areas of research were investigated as part of this project. A speaker model adaptation method, called profiling, was studied. This method makes it possible to develop generic speech models with data from many speakers, and then to customize those models for a particular speech environment such as a new speaker, group of speakers, application, dialect, or level of background noise. Profiling was found to be successful in adapting a generic model for General American English to speakers of the Southern dialect. It was also used successfully to recover most of the performance lost in noisy environment tests. A part of the profiling algorithm was used to adapt to a new headset input device. Performance of the generic models, the starting point for customization, was also improved.

Research was performed in several areas of acoustic processing to further increase the system's noise robustness. A constant framing rate acoustic processor was found to be superior to the former pitch synchronous framing, and better suited to model adaptation. Smoothing of the acoustic parameters improved the performance of the processing stages that follow the acoustic processor. An alternate scheme of acoustic parameterization, using cepstrum parameterization instead of bandpass filterbank parameterization, was found to yield essentially equivalent performance.

Parallel processing was investigated as a means of decreasing the speech decoding time. SSI's Phonetic Decoder was reengineered in preparation for porting to a parallel processor. The result of this effort was to decrease decoding time by about 75 - 85 percent. An expression was derived for estimating the decoding time change from porting to a parallel processing environment.

A system incorporating the results of this research has been consolidated, delivered, and installed at NASA Johnson Space Center, with performance improvements similar to those predicted. Potential commercial applications of this research include any real speech recognition application requiring robust, accurate, and fast recognition in stressful, noisy, or almost any other speech environment.
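To put the decoding-time figures in perspective, the short Python sketch below restates a fractional reduction in decoding time as a speedup factor. The abstract does not reproduce the expression the authors derived for the parallel port, so the second function uses Amdahl's law purely as a stand-in; the function names, the parallel fraction, and the processor count are all hypothetical and are not taken from the report.

```python
def speedup_from_reduction(reduction: float) -> float:
    """Speedup factor implied by a fractional reduction in decoding time.

    A reduction of 0.75 means the new time is 25% of the old time.
    """
    if not 0.0 <= reduction < 1.0:
        raise ValueError("reduction must be in the range [0, 1)")
    return 1.0 / (1.0 - reduction)


def amdahl_time_ratio(parallel_fraction: float, processors: int) -> float:
    """Ratio new_time / old_time under Amdahl's law.

    ASSUMPTION: Amdahl's law is used here only as a generic illustration;
    it is not the expression derived in the report.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be in the range [0, 1]")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    serial_part = 1.0 - parallel_fraction
    return serial_part + parallel_fraction / processors


if __name__ == "__main__":
    # The two ends of the range quoted in the abstract.
    for reduction in (0.75, 0.85):
        factor = speedup_from_reduction(reduction)
        print(f"{reduction:.0%} less decoding time -> {factor:.2f}x speedup")

    # Hypothetical values, chosen only to exercise the function.
    ratio = amdahl_time_ratio(parallel_fraction=0.9, processors=8)
    print(f"Amdahl time ratio (hypothetical inputs): {ratio:.4f}")
```

Read this way, a 75 percent reduction corresponds to a 4x speedup and an 85 percent reduction to roughly 6.7x.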
