Method, apparatus, and system for building a compact model for large vocabulary continuous speech recognition (LVCSR) system

Abstract

According to one aspect of the invention, a method is provided in which a mean vector set and a variance vector set of a set of N Gaussians are divided into multiple mean sub-vector sets and variance sub-vector sets, respectively. Each mean sub-vector set contains a subset of the dimensions of the corresponding mean vector set and each variance sub-vector set contains a subset of the dimensions of the corresponding variance vector set. Each resultant sub-vector set is clustered to build a codebook for the respective sub-vector set using a modified K-means clustering process which dynamically merges and splits clusters based upon the size and average distortion of each cluster during each iteration in the modified K-means clustering process.
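
The abstract describes the procedure only in outline, so the following is a minimal sketch rather than the patented method itself. It assumes consecutive, equal-width dimension subsets, squared Euclidean distortion, and one particular merge/split rule: a cluster smaller than `min_size` is dissolved (its points are absorbed by neighbouring clusters on the next assignment pass) and its codeword is re-seeded inside the cluster with the highest average distortion, which is thereby split. The function names, `sub_dim`, `min_size`, and the fixed iteration count are all hypothetical choices made for illustration. The same routine would be applied to the variance sub-vector sets.

```python
import numpy as np

def split_into_subvectors(vectors, sub_dim):
    """Split an (N, D) array into consecutive (N, sub_dim) sub-vector sets."""
    return [vectors[:, i:i + sub_dim] for i in range(0, vectors.shape[1], sub_dim)]

def assign(data, centroids):
    """Return the index of the nearest centroid for every row of data."""
    d2 = ((data[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def modified_kmeans(data, k, iters=20, min_size=2, seed=0):
    """K-means with an assumed merge/split step driven by size and distortion."""
    data = np.asarray(data, dtype=float)
    rng = np.random.default_rng(seed)
    centroids = data[rng.choice(len(data), size=k, replace=False)].copy()
    for _ in range(iters):
        labels = assign(data, centroids)
        sizes = np.bincount(labels, minlength=k)
        avg = np.zeros(k)  # average distortion of each cluster
        for j in range(k):
            members = data[labels == j]
            if len(members):
                centroids[j] = members.mean(axis=0)
                avg[j] = ((members - centroids[j]) ** 2).sum(axis=1).mean()
        # Assumed rule: dissolve each undersized cluster and split the
        # most distorted cluster that is large enough to be divided.
        for j in np.flatnonzero(sizes < min_size):
            candidates = np.where(sizes >= 2 * min_size, avg, -1.0)
            donor = int(candidates.argmax())
            if candidates[donor] <= 0.0:
                break  # nothing left that is worth splitting
            members = data[labels == donor]
            far = ((members - centroids[donor]) ** 2).sum(axis=1).argmax()
            centroids[j] = members[far]  # re-seed at the donor's farthest member
            avg[donor] = 0.0             # split each donor at most once per pass
    return centroids, assign(data, centroids)
```

Under these assumptions, a set of N mean vectors of shape `(N, D)` would be passed through `split_into_subvectors`, and each resulting sub-vector set would be given to `modified_kmeans` to obtain a codebook plus one codeword index per Gaussian. Each Gaussian's mean is then stored as a short tuple of indices instead of D floating-point values, which is where the compactness comes from.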

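Bibliographic data

  • Publication number: US7454341B1
  • Patent type:
  • Publication date: 2008-11-18
  • Original format: PDF
  • Applicant/patentee: JIELIN PAN; BAOSHENG YUAN
  • Application number: US20000148028
  • Inventors: JIELIN PAN; BAOSHENG YUAN
  • Filing date: 2000-09-30
  • Classification: G10L15/14
  • Country: US
  • Date added to database: 2022-08-21 19:29:35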
