Language-Independent Speaker Classification over a Far-Field Microphone

Abstract

The speaker classification approach described in this contribution leverages the analysis of both speaker and verbal content information, so as to use two light-weight components for classification: a spectral matching component based on a global representation of the entire utterance, and a temporal alignment component based on more conventional frame-level evidence. The paradigm behind the spectral matching component is related to latent semantic mapping, which postulates that the underlying structure in the data is partially obscured by the randomness of local phenomena with respect to information extraction. Uncovering this latent structure results in a parsimonious continuous parameter description of feature frames and spectral bands, which then replaces the original parameterization in clustering and identification. Such global analysis can then be advantageously combined with elementary temporal alignment. This approach has been commercially deployed for the purpose of language-independent desktop voice login over a far-field microphone.
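The abstract does not spell out how the temporal alignment component works. As a minimal sketch, assuming it resembles classic dynamic time warping (DTW) over frame-level feature vectors, the idea could look like the following. The function names, the Euclidean frame distance, the length normalization, and the toy feature values are all illustrative assumptions, not details taken from the paper.

```python
import math


def frame_distance(a, b):
    """Euclidean distance between two feature frames of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(seq_a, seq_b):
    """Length-normalized DTW cost between two sequences of frames.

    Assumption: a plain DTW with steps (i-1, j), (i, j-1), (i-1, j-1);
    the paper's actual alignment procedure may differ.
    """
    n, m = len(seq_a), len(seq_b)
    if n == 0 or m == 0:
        raise ValueError("both sequences must contain at least one frame")
    inf = float("inf")
    # cost[i][j] = cheapest alignment of the first i frames of seq_a
    # with the first j frames of seq_b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m] / (n + m)


# Toy 2-dimensional "frames" (hypothetical values, for illustration only).
enrolled = [[0.0, 1.0], [1.0, 1.0], [2.0, 0.0]]
same_but_slower = [[0.0, 1.0], [0.0, 1.0], [1.0, 1.0], [2.0, 0.0], [2.0, 0.0]]
different_order = [[2.0, 0.0], [1.0, 1.0], [0.0, 1.0]]

score_same = dtw_distance(enrolled, same_but_slower)
score_diff = dtw_distance(enrolled, different_order)
```

In this toy setup the slower repetition of the same frame sequence aligns at zero cost, while the reordered sequence does not. That is the sense in which frame-level alignment captures verbal content, which a global, whole-utterance representation would not see on its own.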

Bibliographic details

Source
Speaker Classification II: Selected Projects; Lecture Notes in Artificial Intelligence 4441; pp. 104-115 (12 pages)
Author
Jerome R. Bellegarda
Affiliation
Apple Inc., Two Infinite Loop, Cupertino, California 95014, USA
Original format: PDF
Language: English
Chinese Library Classification: Programming; software engineering
Keywords
spectral matching; global representation; latent structure; distance metric; desktop voice login
