Conference: Annual Pacific Voice Conference

Spectral and textural features for automatic classification of fricatives


Abstract

Classification of unvoiced fricatives is an important stage in applications such as spoken term detection and audio-video synchronization, and in technologies for the hearing impaired. Because these phonemes are acoustically similar, classifying them successfully requires extracting multiple features and constructing high-dimensional feature vectors. In this study, two dimensionality reduction algorithms, t-distributed Stochastic Neighbor Embedding (t-SNE) and Sequential Forward Floating Selection (SFFS), were used to obtain a compact representation of the data. A classification stage (kNN or SVM) was then applied, in which we compared the identification rates obtained with the original feature vector and with the low-dimensional representation. A total of 1000 unvoiced fricatives (/s/, /sh/, /f/, and /th/) derived from the TIMIT speech database, comprising 25,000 short frames of 8 ms each, were used for the evaluation. We show that representing the data by a feature vector with as few as 3 dimensions yields a classification rate of almost 90%, which outperforms most of the results obtained in previous studies.
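
The SFFS-plus-kNN branch of the pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`knn_predict`, `loo_accuracy`, `sffs`, `make_synthetic_frames`), the leave-one-out scoring, the choice of k = 3, and the synthetic Gaussian "frames" are all assumptions made for illustration, and the t-SNE and SVM variants are not shown. The floating step is a simplified one that tracks only the best score per subset size.

```python
import math
import random


def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    nearest = sorted((math.dist(t, x), label) for t, label in zip(train_X, train_y))[:k]
    votes = {}
    for _, label in nearest:
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)


def loo_accuracy(X, y, features, k=3):
    """Leave-one-out kNN accuracy using only the given feature indices."""
    proj = [[row[f] for f in features] for row in X]
    hits = 0
    for i, sample in enumerate(proj):
        rest_X = proj[:i] + proj[i + 1:]
        rest_y = y[:i] + y[i + 1:]
        hits += knn_predict(rest_X, rest_y, sample, k) == y[i]
    return hits / len(proj)


def sffs(X, y, n_select, k=3):
    """Simplified Sequential Forward Floating Selection (illustrative only)."""
    n_features = len(X[0])
    if not 1 <= n_select <= n_features:
        raise ValueError("n_select must be between 1 and the number of features")
    selected, best_score = [], {}
    while len(selected) < n_select:
        # Forward step: add the feature that gives the highest accuracy.
        remaining = [f for f in range(n_features) if f not in selected]
        selected.append(max(remaining, key=lambda f: loo_accuracy(X, y, selected + [f], k)))
        size = len(selected)
        best_score[size] = max(loo_accuracy(X, y, selected, k), best_score.get(size, 0.0))
        # Floating step: drop a feature while doing so beats the best
        # score recorded so far for the smaller subset size.
        while len(selected) > 2:
            reduced = max(
                ([g for g in selected if g != f] for f in selected),
                key=lambda subset: loo_accuracy(X, y, subset, k),
            )
            reduced_score = loo_accuracy(X, y, reduced, k)
            if reduced_score <= best_score[len(reduced)]:
                break
            selected, best_score[len(reduced)] = reduced, reduced_score
    return selected, loo_accuracy(X, y, selected, k)


def make_synthetic_frames(n_per_class=15, n_features=6, seed=0):
    """Synthetic stand-in for per-frame feature vectors (NOT TIMIT data)."""
    rng = random.Random(seed)
    X, y = [], []
    for c, label in enumerate(["s", "sh", "f", "th"]):
        # Features 0-2 have class-dependent means; the rest are pure noise.
        means = [c * 2.0, (c % 2) * 3.0, (c // 2) * 3.0] + [0.0] * (n_features - 3)
        for _ in range(n_per_class):
            X.append([rng.gauss(m, 1.0) for m in means])
            y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_synthetic_frames()
    all_features = list(range(len(X[0])))
    subset, subset_acc = sffs(X, y, n_select=3)
    print("all features:", round(loo_accuracy(X, y, all_features), 3))
    print("selected", subset, ":", round(subset_acc, 3))
```

Running the script prints the leave-one-out accuracy for the full feature vector and for the three selected features, mirroring the comparison between the original and the low-dimensional representation. The numbers it produces describe only the synthetic data and say nothing about the results reported in the abstract.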
