Temporal patterns of frequency-localized features in ASR.


Abstract

This work investigates the use of frequency-localized temporal patterns of the speech signal for developing a robust front-end for Automatic Speech Recognition (ASR). Various linear transforms are investigated for parameterization of the frequency-localized temporal patterns. We show that temporal patterns closely follow the properties of a first-order Markov process, which results in the PCA transforms being very close to the DCT transform. Better recognition performance is achieved by using the DCT components of temporal patterns as opposed to directly using temporal patterns for feature estimation. Other linear transforms such as Linear Discriminant Analysis (LDA) are also studied for the parameterization. The parameterized TempoRAl PatternS (TRAPS) are used to estimate broad-phonetic class-posteriors independently in each critical-band. These class-posteriors are combined and used as the features for word recognition. Our work shows that broad-phonetic features generalize better than other conventional features and yield considerable complementary information with respect to short-term cepstral features in ASR. Two practical applications are proposed for the broad-phonetic TRAPS features: (1) Distributed Speech Recognition (DSR) in cellular telephony, (2) Voice Activity Detection (VAD) tasks. These features yield a significant improvement in the performance for these applications. New band-independent categories are proposed which represent distinct speech-events in the frequency-localized temporal patterns of the speech signal. These categories are obtained by clustering the mean temporal patterns of context-independent phones using an agglomerative hierarchical clustering technique. A Universal TempoRAl PatternS (UTRAPS) system is proposed for the speech-event class-posteriors estimation. Combining UTRAPS features with cepstral features achieves a significant improvement in the recognition performance under noisy conditions.
Finally, this work studies the effect of broadening the frequency-context on TRAPS features and ASR. This study shows that combining temporal patterns from more than one critical-band is important to achieve higher recognition rates.
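
The abstract's claim that a first-order Markov process makes the PCA transform nearly coincide with the DCT can be checked numerically. The sketch below is a minimal illustration, not the dissertation's experiment: it assumes a unit-variance process with covariance rho^|i-j|, and the values rho = 0.9 and pattern length N = 32 are hypothetical choices that do not come from the source. It measures how much of the covariance energy lands on the diagonal after an orthonormal DCT-II. PCA would place all of it on the diagonal, so a fraction near 1 suggests the DCT basis is close to the PCA basis.

```python
import math


def dct_basis(n):
    """Rows are the orthonormal DCT-II basis vectors of length n."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * t + 1) * k / (2 * n))
                     for t in range(n)])
    return rows


def markov_covariance(n, rho):
    """Covariance of a unit-variance first-order Markov process: rho^|i-j|."""
    return [[rho ** abs(i - j) for j in range(n)] for i in range(n)]


def transform_covariance(basis, cov):
    """Return basis * cov * basis^T, the covariance of the coefficients."""
    n = len(cov)
    tmp = [[sum(basis[k][i] * cov[i][j] for i in range(n)) for j in range(n)]
           for k in range(n)]
    return [[sum(tmp[k][j] * basis[m][j] for j in range(n)) for m in range(n)]
            for k in range(n)]


def diagonal_energy_fraction(mat):
    """Share of the squared matrix entries that lies on the diagonal."""
    total = sum(x * x for row in mat for x in row)
    diag = sum(mat[i][i] ** 2 for i in range(len(mat)))
    return diag / total


# Illustrative assumptions only: neither value is taken from the dissertation.
N, RHO = 32, 0.9

cov = markov_covariance(N, RHO)
raw_frac = diagonal_energy_fraction(cov)
dct_frac = diagonal_energy_fraction(transform_covariance(dct_basis(N), cov))

print(f"diagonal energy fraction without transform: {raw_frac:.3f}")
print(f"diagonal energy fraction after DCT-II:      {dct_frac:.3f}")
```

Under these assumptions the DCT concentrates most of the covariance energy on the diagonal, whereas the untransformed covariance keeps most of it off the diagonal.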

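Bibliographic record

Author: Jain, Pratibha
Author affiliation: OGI School of Science & Engineering
Degree-granting institution: OGI School of Science & Engineering
Subject: Computer Science; Engineering Electronics and Electrical; Speech Communication
Degree: Ph.D.
Year: 2003
Pages: 107 p.
Total pages: 107
Original format: PDF
Language of text: English (eng)
Chinese Library Classification: Automation technology, computer technology; radio electronics, telecommunications technology; linguistics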

