Published in: Speaker Classification II: Selected Projects; Lecture Notes in Artificial Intelligence; 4441

Language-Independent Speaker Classification over a Far-Field Microphone

Abstract

The speaker classification approach described in this contribution leverages the analysis of both speaker and verbal content information, so as to use two light-weight components for classification: a spectral matching component based on a global representation of the entire utterance, and a temporal alignment component based on more conventional frame-level evidence. The paradigm behind the spectral matching component is related to latent semantic mapping, which postulates that the underlying structure in the data is partially obscured by the randomness of local phenomena with respect to information extraction. Uncovering this latent structure results in a parsimonious continuous parameter description of feature frames and spectral bands, which then replaces the original parameterization in clustering and identification. Such global analysis can then be advantageously combined with elementary temporal alignment. This approach has been commercially deployed for the purpose of language-independent desktop voice login over a far-field microphone.
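As a rough illustration of how a global spectral comparison and a frame-level alignment might be combined, the sketch below scores two utterances, each given as a frames-by-spectral-bands matrix. The abstract does not specify the distance measures or the combination rule, so everything here is an assumption: the function names, the `rank` and `weight` parameters, the use of principal angles between truncated SVD subspaces for the spectral part, and classic dynamic time warping for the temporal part. It is not the authors' implementation.

```python
import numpy as np


def spectral_subspace(frames, rank):
    """Return the top-`rank` right singular vectors of a
    (num_frames x num_bands) matrix as orthonormal columns."""
    _, _, vt = np.linalg.svd(frames, full_matrices=False)
    return vt[:rank].T  # shape: (num_bands, rank)


def subspace_distance(a, b):
    """Distance in [0, 1] derived from the principal angles between
    two subspaces with orthonormal bases `a` and `b` (assumed metric)."""
    cosines = np.linalg.svd(a.T @ b, compute_uv=False)
    cosines = np.clip(cosines, 0.0, 1.0)
    return float(1.0 - np.mean(cosines ** 2))


def dtw_distance(x, y):
    """Classic dynamic time warping cost between two frame sequences,
    normalized by the combined sequence length."""
    n, m = len(x), len(y)
    # Pairwise Euclidean distance between every frame of x and y.
    local = np.linalg.norm(x[:, None, :] - y[None, :, :], axis=2)
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best_prev = min(acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
            acc[i, j] = local[i - 1, j - 1] + best_prev
    return float(acc[n, m] / (n + m))


def combined_score(test, reference, rank=3, weight=0.5):
    """Weighted sum of the global spectral distance and the temporal
    alignment cost; lower means more similar (hypothetical rule)."""
    spectral = subspace_distance(
        spectral_subspace(test, rank), spectral_subspace(reference, rank)
    )
    temporal = dtw_distance(test, reference)
    return weight * spectral + (1.0 - weight) * temporal


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    enrolled = rng.normal(size=(20, 8))   # 20 frames, 8 spectral bands
    attempt = rng.normal(size=(25, 8))    # a different utterance
    print(combined_score(enrolled, enrolled))
    print(combined_score(attempt, enrolled))
```

In a login-style setting, a score like this would be compared against a threshold tuned on held-out data; the two terms are on different scales here, so the `weight` value would also need tuning.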
