Nonlinear time-frequency distributions of spectrum energy operator in large vocabulary mandarin speaker independent speech recognition system

Al-Dulaimy Fadhil H. T.; Wang Zuoying

Abstract

This work demonstrates the use of the nonlinear time-frequency distribution (NLTFD) of a discrete time energy operator (DTEO), based on amplitude modulation-demodulation techniques, as a feature in speech recognition. The duration distribution based hidden Markov model in a speaker-independent, large-vocabulary Mandarin speech recognition system was reconstructed from the feature vectors in the front-end detection stage. The goal was to improve the performance of the existing system by combining new features with the baseline feature vector. This paper also deals with errors associated with using a pre-emphasis filter in the front-end processing of the present scheme, which increases the noise energy at high frequencies above 4 kHz and in some cases degrades the recognition accuracy. The experimental results show that eliminating the pre-emphasis filter from the pre-processing stage and using NLTFD with compensated DTEO combined with Mel frequency cepstrum components gives a 21.95% relative reduction in error rate compared to the conventional technique, with 25 candidates used in the test.
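The two signal-processing ideas named in the abstract can be illustrated with a minimal sketch. It assumes the DTEO is the standard discrete-time Teager-Kaiser operator, psi[n] = x[n]^2 - x[n-1]*x[n+1], and that the pre-emphasis filter is the common first-order form y[n] = x[n] - alpha*x[n-1]. Neither definition, the coefficient alpha = 0.97, nor the 16 kHz sampling rate is stated in the abstract; all are assumptions made for illustration, and the function names are hypothetical.

```python
import math


def teager_energy(x):
    """Teager-Kaiser energy psi[n] = x[n]^2 - x[n-1]*x[n+1] (assumed DTEO form).

    Returns values for the interior samples only, so the output is two
    samples shorter than the input.
    """
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


def pre_emphasis(x, alpha=0.97):
    """First-order pre-emphasis y[n] = x[n] - alpha*x[n-1] (alpha is assumed)."""
    return [x[0]] + [x[n] - alpha * x[n - 1] for n in range(1, len(x))]


def pre_emphasis_gain(freq_hz, fs_hz, alpha=0.97):
    """Magnitude response |1 - alpha*exp(-j*omega)| of the filter above."""
    omega = 2.0 * math.pi * freq_hz / fs_hz
    return math.sqrt(1.0 + alpha * alpha - 2.0 * alpha * math.cos(omega))


# A pure tone A*cos(omega*n): the operator output is the constant
# A^2 * sin(omega)^2, so it tracks amplitude and frequency jointly.
fs = 16000.0            # assumed sampling rate in Hz
amp, freq = 0.5, 1000.0
omega = 2.0 * math.pi * freq / fs
tone = [amp * math.cos(omega * n) for n in range(64)]
psi = teager_energy(tone)
expected = (amp ** 2) * math.sin(omega) ** 2

# The pre-emphasis gain grows with frequency, so any noise above 4 kHz is
# amplified more than the low-frequency speech content.
gain_low = pre_emphasis_gain(1000.0, fs)
gain_high = pre_emphasis_gain(6000.0, fs)
```

Under these assumptions, `psi` is flat for a steady tone, and `gain_high` exceeds `gain_low`, which is consistent with the abstract's remark that pre-emphasis raises noise energy at high frequencies.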

Bibliographic details

Source: Tsinghua Science and Technology, 2003, No. 6, pp. 667-671 (5 pages)
Authors: Al-Dulaimy Fadhil H. T.; Wang Zuoying
Affiliation: Department of Electronic Engineering, Tsinghua University, Beijing 100084, China
Original format: PDF
Language: English
Keywords: large vocabulary speech recognition; duration distribution based hidden Markov model; energy operator; robust feature
