MULTI-VIEW CCA-BASED ACOUSTIC FEATURES FOR PHONETIC RECOGNITION ACROSS SPEAKERS AND DOMAINS

Abstract

Canonical correlation analysis (CCA) and kernel CCA can be used for unsupervised learning of acoustic features when a second view (e.g., articulatory measurements) is available for some training data, and such projections have been used to improve phonetic frame classification. Here we study the behavior of CCA-based acoustic features on the task of phonetic recognition, and investigate to what extent they are speaker-independent or domain-independent. The acoustic features are learned using data drawn from the University of Wisconsin X-ray Microbeam Database (XRMB). The features are evaluated within and across speakers on XRMB data, as well as on out-of-domain TIMIT and MOCHA-TIMIT data. Experimental results show consistent improvement with the learned acoustic features over baseline MFCCs and PCA projections. In both speaker-dependent and cross-speaker experiments, phonetic error rates are improved by 4-9% absolute (10-23% relative) using CCA-based features over baseline MFCCs. In cross-domain phonetic recognition (training on XRMB and testing on MOCHA or TIMIT), the learned projections provide smaller improvements.
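The abstract does not spell out how the projections are computed, so the following is only a minimal sketch of regularized linear CCA, not the authors' implementation. It assumes two frame-aligned views at training time (an acoustic matrix `X` and an articulatory matrix `Y`) and shows that only the acoustic projection is needed at test time. The function names `fit_cca` and `project_acoustic`, the regularizer `reg`, and the output dimension `k` are hypothetical choices made for illustration.

```python
import numpy as np


def fit_cca(X, Y, k, reg=1e-4):
    """Fit regularized linear CCA (illustrative sketch).

    X: (n, dx) acoustic frames, Y: (n, dy) articulatory frames, row-aligned.
    Returns the view means, the projection matrices A (dx, k) and B (dy, k),
    and the top-k canonical correlations in descending order.
    """
    n = X.shape[0]
    mx, my = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - mx, Y - my

    # Covariance estimates; the ridge term `reg` (an assumed value) keeps
    # the within-view covariances invertible.
    Sxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / (n - 1)

    def inv_sqrt(S):
        # Inverse matrix square root of a symmetric positive definite matrix.
        w, V = np.linalg.eigh(S)
        return V @ np.diag(w ** -0.5) @ V.T

    Sxx_is, Syy_is = inv_sqrt(Sxx), inv_sqrt(Syy)

    # Singular vectors of the whitened cross-covariance give the canonical
    # directions; the singular values are the canonical correlations.
    U, s, Vt = np.linalg.svd(Sxx_is @ Sxy @ Syy_is)
    A = Sxx_is @ U[:, :k]
    B = Syy_is @ Vt.T[:, :k]
    return mx, A, my, B, s[:k]


def project_acoustic(X, mx, A):
    """Map acoustic frames to learned features; no second view is required."""
    return (X - mx) @ A
```

In this sketch, `fit_cca` would be run once on data where both views exist, and `project_acoustic` would then be applied to acoustic-only data from other speakers or corpora before training the recognizer. Kernel CCA, which the abstract also mentions, replaces the linear maps with kernel-induced ones and is not covered here.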

Bibliographic details

Source: IEEE International Conference on Acoustics, Speech, and Signal Processing, 2013, 5 pages
Authors: Raman Arora; Karen Livescu
Original format: PDF
Chinese Library Classification: TN912-53
