Journal: IEEE Transactions on Speech and Audio Processing

Time-domain isolated phoneme classification using reconstructed phase spaces

Abstract

This paper introduces a novel time-domain approach to modeling and classifying speech phoneme waveforms. The approach is based on statistical models of reconstructed phase spaces, which offer significant theoretical benefits as representations that are known to be topologically equivalent to the state dynamics of the underlying production system. The lag and dimension parameters of the reconstruction process for speech are examined in detail, comparing common estimation heuristics for these parameters with corresponding maximum likelihood recognition accuracy over the TIMIT data set. Overall accuracies are compared with a Mel-frequency cepstral baseline system across five different phonetic classes within TIMIT, and a composite classifier using both cepstral and phase space features is developed. Results indicate that although the accuracy of the phase space approach by itself is currently below that of baseline cepstral methods, a combined approach is capable of increasing speaker-independent phoneme accuracy.
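
As an illustration of the reconstruction step the abstract refers to, the following is a minimal sketch of a time-delay embedding, the standard way to build a reconstructed phase space from a scalar signal. It is not taken from the paper: the function name `delay_embed`, the forward-lag convention (each point is x[n], x[n + lag], ...), and the toy input values are assumptions made for this example. The abstract does not state which lag and dimension values were used, so the ones here are arbitrary.

```python
from typing import List, Sequence, Tuple


def delay_embed(
    signal: Sequence[float], dimension: int, lag: int
) -> List[Tuple[float, ...]]:
    """Return the time-delay embedding of `signal` (hypothetical helper).

    Point n is (x[n], x[n + lag], ..., x[n + (dimension - 1) * lag]).
    The forward-lag convention is an assumption of this sketch.
    """
    if dimension < 1 or lag < 1:
        raise ValueError("dimension and lag must be positive integers")
    span = (dimension - 1) * lag  # samples consumed by the delays
    n_points = len(signal) - span
    if n_points <= 0:
        return []  # signal too short for this (dimension, lag) pair
    return [
        tuple(signal[n + k * lag] for k in range(dimension))
        for n in range(n_points)
    ]


# Toy input: 8 samples, dimension 3, lag 2 (illustrative values only).
points = delay_embed([0, 1, 2, 3, 4, 5, 6, 7], dimension=3, lag=2)
for p in points:
    print(p)
```

Each output tuple is one point in the reconstructed space. A statistical model of the kind the abstract mentions would then be fitted to the cloud of such points for each phoneme class, with `dimension` and `lag` chosen by heuristics or by recognition accuracy.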