Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing

Dirichlet Process Mixture of Mixtures Model for Unsupervised Subword Modeling

Abstract

We develop a parallelizable Markov chain Monte Carlo sampler for a Dirichlet process mixture of mixtures model. The sampler jointly infers a codebook and clusters: the codebook is a global collection of components, and each cluster is a mixture defined over that codebook. We combine a nonergodic Gibbs sampler with two layers of split and merge samplers, one at the codebook level and one at the mixture level, to form a valid ergodic chain. We also design a switch sampler for components, which supports convergence in our experiments. For unsupervised subword modeling, we show that our method infers complex classes from real speech feature vectors and that these classes consistently score higher on several evaluation metrics. Compared to a standard Dirichlet process mixture model sampler, our method also infers fewer classes, which represent subword units more consistently and have longer durations.
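The two-level structure described in the abstract (a global codebook of components, with each cluster a mixture over that codebook) can be illustrated with a small generative sketch. This is not the authors' sampler and does not perform inference; it only draws synthetic data from a model of that general shape. Several simplifications are assumptions made for illustration: the codebook has a fixed finite size, components are 1-D Gaussians with unit variance, cluster weights come from a symmetric Dirichlet, and clusters are opened by a Chinese restaurant process. All function and parameter names (`crp_assign`, `sample_mixture_of_mixtures`, `alpha`, `beta`) are hypothetical.

```python
import random


def crp_assign(counts, alpha, rng):
    """Chinese restaurant process step: pick an existing cluster with
    probability proportional to its size, or open a new one with
    probability proportional to alpha."""
    r = rng.random() * (sum(counts) + alpha)
    acc = 0.0
    for k, c in enumerate(counts):
        acc += c
        if r < acc:
            return k
    return len(counts)  # index of a new cluster


def symmetric_dirichlet(concentration, size, rng):
    """Draw from a symmetric Dirichlet by normalizing gamma variates."""
    g = [rng.gammavariate(concentration, 1.0) for _ in range(size)]
    total = sum(g)
    return [x / total for x in g]


def sample_mixture_of_mixtures(n_points, codebook_size=5, alpha=1.0,
                               beta=0.5, seed=0):
    """Generate data from a toy mixture of mixtures (assumed setup)."""
    rng = random.Random(seed)
    # Codebook: global components shared by every cluster.
    # Assumption: 1-D Gaussians with unit variance and random means.
    codebook_means = [rng.gauss(0.0, 10.0) for _ in range(codebook_size)]
    cluster_weights = []  # per cluster: mixture weights over the codebook
    cluster_counts = []   # per cluster: number of points assigned so far
    data, cluster_labels, component_labels = [], [], []
    for _ in range(n_points):
        z = crp_assign(cluster_counts, alpha, rng)
        if z == len(cluster_counts):  # a new cluster was opened
            cluster_counts.append(0)
            cluster_weights.append(
                symmetric_dirichlet(beta, codebook_size, rng))
        cluster_counts[z] += 1
        # Within the cluster, choose a codebook component, then emit a point.
        c = rng.choices(range(codebook_size), weights=cluster_weights[z])[0]
        data.append(rng.gauss(codebook_means[c], 1.0))
        cluster_labels.append(z)
        component_labels.append(c)
    return {
        "data": data,
        "cluster_labels": cluster_labels,
        "component_labels": component_labels,
        "cluster_weights": cluster_weights,
        "cluster_counts": cluster_counts,
        "codebook_means": codebook_means,
    }
```

In this sketch two clusters can place weight on the same codebook component, which is the sharing that distinguishes a mixture of mixtures from a plain mixture model where each class owns a single component.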