Conference of the International Speech Communication Association
A Method for Structure Estimation of Weighted Finite-State Transducers and Its Application To Grapheme-to-Phoneme Conversion

Abstract

Weighted finite-state transducers (WFSTs) are widely used as a fundamental data structure in spoken language processing systems because they provide a unified representation of many types of probabilistic models. Although accurate WFSTs are important in many spoken language systems, WFSTs are conventionally obtained by transforming probabilistic models that were not estimated to optimize WFST accuracy. Several recent techniques enable direct optimization of the weight parameters in WFSTs; however, these techniques do not optimize the structures of WFSTs directly. In this paper, with the goal of estimating WFST structures directly from a dataset, we introduce a Bayesian method for structure inference. The proposed method employs the hierarchical Dirichlet process (HDP) as a prior over the generative processes of arcs in the WFSTs. Because the HDP can handle countably infinite entities, the proposed method can potentially generate an infinite number of arcs in the WFSTs. The efficiency of the proposed method is verified by estimating WFSTs for grapheme-to-phoneme (G2P) conversion. We confirmed that the WFST obtained by the proposed method realized a compact representation of G2P conversion compared with conventional N-gram-based G2P models.
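
To give some intuition for how a Dirichlet-process prior lets the set of arcs grow with the data, the following is a minimal sketch of a single-level Chinese restaurant process over the outgoing arcs of one state. It is not the HDP model or inference procedure of the paper: the function names (`sample_arc`, `base_arc`, `draw_arcs`), the arc layout `(grapheme, phoneme, next_state)`, the toy symbol inventory, and the concentration value are all assumptions made for illustration.

```python
import random
from collections import Counter


def sample_arc(counts, alpha, base_sampler, rng):
    """Draw one outgoing arc for a state with a Chinese restaurant process.

    counts: Counter of arcs already drawn from this state.
    alpha: concentration parameter (> 0); larger values favour fresh draws.
    base_sampler: callable taking rng and returning an arc from the base distribution.
    """
    total = sum(counts.values())
    # With probability alpha / (total + alpha), draw from the base distribution.
    if rng.random() * (total + alpha) < alpha:
        arc = base_sampler(rng)
    else:
        # Otherwise reuse an existing arc in proportion to how often it was used.
        arcs = list(counts)
        arc = rng.choices(arcs, weights=[counts[a] for a in arcs])[0]
    counts[arc] += 1
    return arc


def base_arc(rng):
    """Hypothetical base distribution: uniform over a tiny toy inventory."""
    graphemes = "abc"
    phonemes = ["AH", "B", "K"]
    # Arc layout (input grapheme, output phoneme, next state) is an assumption.
    return (rng.choice(graphemes), rng.choice(phonemes), rng.randrange(3))


def draw_arcs(n, alpha, seed=0):
    """Draw n arcs for a single state and return their usage counts."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n):
        sample_arc(counts, alpha, base_arc, rng)
    return counts


if __name__ == "__main__":
    counts = draw_arcs(50, alpha=1.0)
    for arc, c in counts.most_common():
        print(arc, c)
```

In this sketch a few arcs tend to be reused often while new arcs are still introduced occasionally, so a compact structure can emerge from the data. The toy base distribution here is finite (27 possible arcs); with a base distribution over an unbounded space, the number of distinct arcs would not be capped in advance.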
