Conference: International Conference on Spoken Language Processing; October 4-8, 2004; Jeju (KR)

Design of Ready-Made Acoustic Model Library by Two-Dimensional Visualization of Acoustic Space

Abstract

This paper proposes a technique for designing a ready-made library of small, high-performance acoustic models, and for providing one of these models to a user without overburdening them. The technique builds on a method for visualizing multiple HMM acoustic models in two-dimensional space (the "COSMOS" method: aCOustic Space Map Of Sound). The acoustic space (as expressed in multi-dimensional feature parameters) is partitioned into zones on the two-dimensional space, and an acoustic model is generated for each zone, which allows highly precise acoustic models to be created. The set of these acoustic models is called an acoustic model library. In the experiment reported in this paper, a plotted map (called the COSMOS map) featuring a total of 145 male speakers speaking in various styles was generated with the COSMOS method. Using the COSMOS map, the distribution of each speaking style and the relationship between a speaker's position on the map and speech-recognition performance were analyzed, demonstrating the effectiveness of the COSMOS method for analyzing acoustic space. The COSMOS map was then partitioned into concentric acoustic space zones, and an acoustic model representing each zone was produced. When the acoustic model giving the maximum likelihood score was selected using voice samples of 5 words, that model, even though expressed with a single Gaussian distribution, showed performance comparable to a speaker-independent acoustic model (called the SI-model) expressed with 16-mixture Gaussian distributions. Furthermore, it outperformed the SI-model adapted by the MLLR method with voice samples of 30 words.
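The two steps the abstract describes, assigning points on a two-dimensional map to concentric zones and picking the library model with the highest likelihood for a short voice sample, can be illustrated with a minimal sketch. It is not the paper's implementation. The abstract does not say how zone boundaries are defined, so rings of fixed radius around a map center are an assumption, and a single diagonal-covariance Gaussian per zone stands in for the HMM acoustic models. The names `assign_zone`, `log_likelihood`, `select_model`, and the toy library values are all hypothetical.

```python
import math

def assign_zone(point, center, radii):
    """Return the index of the concentric zone that contains a 2-D map point.

    Assumption: zones are rings around `center`, and `radii` lists their outer
    radii in ascending order. Points beyond the last radius are placed in the
    outermost zone.
    """
    distance = math.dist(point, center)
    for index, radius in enumerate(radii):
        if distance <= radius:
            return index
    return len(radii) - 1

def log_likelihood(frames, mean, var):
    """Total log-likelihood of feature frames under one diagonal Gaussian.

    Assumption: a single Gaussian stands in for a full HMM acoustic model.
    """
    total = 0.0
    for frame in frames:
        for x, m, v in zip(frame, mean, var):
            total += -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
    return total

def select_model(frames, library):
    """Return the name of the library model with the highest log-likelihood."""
    return max(library, key=lambda name: log_likelihood(frames, *library[name]))

# Hypothetical two-model library: name -> (mean vector, variance vector).
library = {
    "zone_0": ([0.0, 0.0], [1.0, 1.0]),
    "zone_1": ([3.0, 3.0], [1.0, 1.0]),
}
# Hypothetical feature frames from a short voice sample.
frames = [[2.8, 3.1], [3.2, 2.9]]
best = select_model(frames, library)
```

Selection of this kind only scores the sample against each stored model, which is consistent with the abstract's point that a 5-word sample is enough to choose a model without burdening the user.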
