Method and apparatus for speaker identification using mixture discriminant analysis to develop speaker models

Abstract

A speaker identification system is provided that constructs speaker models using a discriminant analysis technique in which the data in each class are modeled by Gaussian mixtures. The speaker identification method and apparatus determine the identity of a speaker, as one of a small group, based on a sentence-length password utterance. A speaker's utterance is received, and a sequence of a first set of feature vectors is computed based on the received utterance. The first set of feature vectors is then transformed into a second set of feature vectors using transformations specific to a particular segmentation unit, and likelihood scores of the second set of feature vectors are computed using speaker models trained with mixture discriminant analysis. The likelihood scores are then combined to determine an utterance score, and the speaker's identity is validated based on the utterance score. The speaker identification method and apparatus also include training and enrollment phases. In the enrollment phase, the speaker's password utterance is received multiple times. A transcription of the password utterance as a sequence of phones is obtained, and the phone string is stored in a database containing the phone strings of the other speakers in the group. In the training phase, the first set of feature vectors is extracted from each password utterance, and the phone boundaries for each phone in the password transcription are obtained using a speaker-independent phone recognizer. A mixture model is developed for each phone of a given speaker's password. Then, using the feature vectors from the password utterances of all of the speakers in the group, transformation parameters and transformed models are generated for each phone and speaker, using mixture discriminant analysis.
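The identification step described in the abstract (transform each feature vector with a phone-specific transformation, score it against each candidate speaker's mixture model, combine the scores into an utterance score, pick the best speaker) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the names `DiagonalGMM`, `transform`, `utterance_score`, and `identify` are hypothetical, the covariances are assumed diagonal, the transformations are assumed linear and shared across speakers for a given phone, the phone segmentation is assumed to be supplied already, and the likelihood scores are combined by averaging per-frame log-likelihoods.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Tuple

Vector = List[float]
Matrix = List[List[float]]
Segments = List[Tuple[str, List[Vector]]]  # (phone label, frames in that phone)


@dataclass
class DiagonalGMM:
    """Gaussian mixture with diagonal covariances (assumed here for brevity)."""
    weights: List[float]
    means: List[Vector]
    variances: List[Vector]

    def log_likelihood(self, x: Vector) -> float:
        # log sum_k w_k N(x; mu_k, diag(var_k)), evaluated with log-sum-exp.
        logs = []
        for w, mu, var in zip(self.weights, self.means, self.variances):
            log_norm = -0.5 * sum(math.log(2.0 * math.pi * v) for v in var)
            log_exp = -0.5 * sum((xi - mi) ** 2 / v
                                 for xi, mi, v in zip(x, mu, var))
            logs.append(math.log(w) + log_norm + log_exp)
        peak = max(logs)
        return peak + math.log(sum(math.exp(l - peak) for l in logs))


def transform(matrix: Matrix, x: Vector) -> Vector:
    """Map a first-set feature vector to the second set (assumed linear map)."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in matrix]


def utterance_score(segments: Segments,
                    transforms: Dict[str, Matrix],
                    models: Dict[str, DiagonalGMM]) -> float:
    """Average per-frame log-likelihood over all phone segments (assumed rule)."""
    total, count = 0.0, 0
    for phone, frames in segments:
        for x in frames:
            y = transform(transforms[phone], x)
            total += models[phone].log_likelihood(y)
            count += 1
    if count == 0:
        raise ValueError("utterance contains no frames")
    return total / count


def identify(segments_by_speaker: Dict[str, Segments],
             transforms: Dict[str, Matrix],
             models_by_speaker: Dict[str, Dict[str, DiagonalGMM]]):
    """Score the utterance against every enrolled speaker and return the best."""
    scores = {
        speaker: utterance_score(segments, transforms, models_by_speaker[speaker])
        for speaker, segments in segments_by_speaker.items()
    }
    best = max(scores, key=scores.get)
    return best, scores
```

In this sketch the utterance is segmented separately for each candidate, since each speaker in the group has a stored phone string; a caller would pass one segmentation per speaker and compare the returned score against a threshold if validation rather than closed-set identification is wanted.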

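Bibliographic data

  • Publication number: US6330536B1

    Patent type:

  • Publication date: 2001-12-11

    Original format: PDF

  • Applicant/Assignee: AT&T CORP.

    Application number: US20010809226

  • Filing date: 2001-03-16

  • Classification: G10L170/00

  • Country: US

  • Date added to database: 2022-08-22 00:48:26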
