An investigation of variable block length methods for calculation of spectral/temporal features for automatic speech recognition

Abstract

This paper presents an investigation of non-uniform time sampling methods for spectral/temporal feature extraction for use in automatic speech recognition. In most current methods for signal modeling of speech information, "dynamic" features are determined from frame-based parameters using a fixed time sampling, i.e., fixed block length and fixed block spacing. This work explores new methods in which block length and/or block spacing are variable. Three methods are suggested, and each was tested with the TIMIT database using a standard HMM recognizer. Phone recognition experiments were conducted using the standard 39-phone set. The methods were also evaluated with various HMM model complexities. Experimental results indicated that none of the proposed non-uniform feature time sampling methods performs significantly better than fixed time sampling methods. However, the best results obtained with the front end are comparable to those obtained with current state-of-the-art systems. Also, the performance of our monophone system surpasses that of most reported context-dependent monophone systems.

Bibliographic details

Source
6th International Conference on Spoken Language Processing (ICSLP 2000), Oct. 16-20, 2000, Beijing International Convention Center, Beijing, China | 2000 | pp. 700-703 | 4 pages
Authors
Montri Karnjanadecha; Stephen A. Zahorian
Original format: PDF
CLC classification: Culture and cultural undertakings of countries worldwide

Similar literature

1. Spectral and Temporal Envelope Cues for Human and Automatic Speech Recognition in Noise [J]. Journal of the Association for Research in Otolaryngology: JARO. 2020, No. 1

2. Feature selection for robust automatic speech recognition: a temporal offset approach [J]. Ludovic Trottier, Philippe Giguere, Brahim Chaib-draa. International Journal of Speech Technology. 2015, No. 3

3. Nonlinear spectro-temporal features based on a cochlear model for automatic speech recognition in a noisy situation [J]. Choi Y.-S., Lee S.-Y. Neural Networks: The Official Journal of the International Neural Network Society. 2013

4. An investigation of variable block length methods for calculation of spectral/temporal features for automatic speech recognition [C]. Montri Karnjanadecha, Stephen A. Zahorian. International Conference on Spoken Language Processing. 2000

5. An automatic speech recognition oriented study on segmentation, low dimensional feature extraction, and temporal trajectory information capture [D]. Zhu, Yonggang. 2002

6. Spectral and Temporal Envelope Cues for Human and Automatic Speech Recognition in Noise [O]. Guangxin Hu, Sarah C. Determan, Yue Dong. 2020

7. Spectro-temporal Features for Automatic Speech Recognition using Linear Prediction in Spectral Domain [O]. Ganapathy Sriram, Hermansky H., Thomas Samuel. 2008
