Journal: Computer Speech and Language

Self-learning speaker identification for enhanced speech recognition


Abstract

This article presents a novel approach to joint speaker identification and speech recognition. For a limited number of recurring users, the interaction of speaker identification, speech recognition, and speaker adaptation achieves unsupervised speaker tracking and automatic adaptation of the human-computer interface. A compact model of speech and speaker characteristics is presented together with a technique for efficient information retrieval. Applying speaker-specific profiles lets the speech recognizer take individual speech characteristics into account and thereby achieve higher recognition rates. Speaker profiles are initialized and continuously adapted by a balanced strategy of short-term and long-term speaker adaptation combined with robust speaker identification. The resulting self-learning, speech-controlled system can track different users. Only a very short enrollment is required for each speaker; subsequent utterances are used for unsupervised adaptation, which continuously improves speech recognition rates. In addition, the detection of unknown speakers is examined with the aim of avoiding explicit training of new speaker profiles. The speech-controlled system presented here is suitable for in-car applications on embedded devices, e.g. speech-controlled navigation, hands-free telephony, or infotainment systems. Results are presented for a subset of the SPEECON database. They validate the benefit of the speaker adaptation scheme and the unified modeling in terms of speaker identification and speech recognition rates.
