2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)

Feature optimized DPGMM clustering for unsupervised subword modeling: A contribution to zerospeech 2017



Abstract

This paper describes our unsupervised subword modeling pipeline for the zero resource speech challenge (ZeroSpeech) 2017. Our approach is built around the Dirichlet process Gaussian mixture model (DPGMM), which we use to cluster speech feature vectors into a dynamically sized set of classes. By considering each class an acoustic unit, speech can be represented as a sequence of class posteriorgrams. We enhance this method by automatically optimizing the DPGMM sampler's input features in a multi-stage clustering framework, where we learn transformations in an unsupervised manner using LDA, MLLT and (basis) fMLLR to reduce variance in the features. We show that this optimization considerably boosts subword modeling quality, as measured by performance on the ABX phone discriminability task. For the first time, we apply the inferred subword models to previously unseen data from a new set of speakers. We demonstrate our method's good generalization and the effectiveness of its blind speaker adaptation in extensive experiments on a multitude of datasets. Our pipeline requires very little hyper-parameter adjustment and is entirely unsupervised, i.e., it takes only raw audio recordings as input, without requiring any pre-defined segmentation, explicit speaker IDs or other metadata.
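The clustering step described above can be sketched with scikit-learn's variational Dirichlet-process mixture. Note the paper uses a collapsed Gibbs sampler for the DPGMM, whereas `BayesianGaussianMixture` is a variational stand-in; the feature matrix below is a random placeholder for real frame-level speech features (e.g. MFCCs), and all dimensions are illustrative assumptions.

```python
# Sketch: cluster frame-level feature vectors into a dynamically sized
# set of acoustic units and represent speech as class posteriorgrams.
# This uses sklearn's truncated variational DP mixture, not the paper's
# Gibbs sampler; data and dimensions are placeholder assumptions.
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

rng = np.random.default_rng(0)
features = rng.normal(size=(500, 13))  # e.g. 500 frames of 13-dim MFCCs

dpgmm = BayesianGaussianMixture(
    n_components=20,  # truncation level; unused components get tiny weights
    weight_concentration_prior_type="dirichlet_process",
    covariance_type="full",
    max_iter=200,
    random_state=0,
).fit(features)

# Each row is one frame's posterior over the inferred units: a posteriorgram.
posteriorgram = dpgmm.predict_proba(features)
print(posteriorgram.shape)  # (n_frames, n_components)
```

Because the Dirichlet-process prior concentrates mass on a subset of the truncated components, the effective number of acoustic units is inferred from the data rather than fixed in advance, which matches the "dynamically sized set of classes" in the abstract.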
