

Investigating the Stacked Phonetic Bottleneck Feature for Speaker Verification with Short Voice Commands



Abstract

Text-dependent speaker verification (SV) with short voice commands (SV-SVC) is in increasing demand in many applications. Unlike conventional SV, SV-SVC usually uses short, fixed voice commands for user-friendliness, which poses technical challenges compared with conventional text-dependent SV using fixed phrases (SV-FP). Research results show that mainstream SV techniques do not perform well on SV-SVC tasks because they suffer from strong lexical overlap and short utterance lengths. In this paper, we propose to fully exploit the acoustic features and contextual information of phonetic units to obtain a better representation of speaker- and utterance-related information for i-vector based SV-SVC systems. Specifically, instead of using MFCC only, four features are extracted for developing i-vector based SV-SVC systems: the frame-based phonetic bottleneck (PBN) feature extracted from a phonetic bottleneck neural network (PBNN), the stacked phonetic bottleneck (SBN) feature, the cascaded feature of PBN and MFCC, and the cascaded feature of SBN and MFCC (SBNF+MFCC). Extensive experiments on the benchmark database RSR2015 have been conducted to evaluate the performance of the proposed i-vector SV-SVC systems. Encouragingly, the contextual information learnt from the stacked PBNN does help, and the proposed i-vector SV-SVC system with SBNF+MFCC outperforms the other systems under the experimental conditions.
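The abstract does not spell out how the stacked and cascaded features are built, so the following is only a minimal sketch under two assumptions: that "stacking" can be approximated by splicing each frame with its neighbouring frames, and that "cascading" means per-frame concatenation of two feature streams. The function names `splice_context` and `concat_features`, the context widths, and the toy values are hypothetical and are not taken from the paper.

```python
from typing import List

Frame = List[float]


def splice_context(frames: List[Frame], left: int, right: int) -> List[Frame]:
    """Stack each frame with `left` preceding and `right` following frames.

    Assumption: edges are padded by repeating the first or last frame.
    """
    n = len(frames)
    spliced: List[Frame] = []
    for t in range(n):
        stacked: Frame = []
        for offset in range(-left, right + 1):
            idx = min(max(t + offset, 0), n - 1)  # clamp index at the edges
            stacked.extend(frames[idx])
        spliced.append(stacked)
    return spliced


def concat_features(a: List[Frame], b: List[Frame]) -> List[Frame]:
    """Concatenate two frame-aligned feature streams frame by frame."""
    if len(a) != len(b):
        raise ValueError("feature streams must have the same number of frames")
    return [x + y for x, y in zip(a, b)]


# Toy data: 4 frames of a 2-dim stand-in for a bottleneck feature
# and a 1-dim stand-in for MFCC (real features are much wider).
pbn = [[float(t), float(t) + 0.5] for t in range(4)]
mfcc = [[10.0 * t] for t in range(4)]

stacked = splice_context(pbn, left=1, right=1)   # 3 frames x 2 dims = 6 dims
combined = concat_features(stacked, mfcc)        # 6 + 1 = 7 dims per frame
```

In this sketch the combined stream keeps one vector per frame, so it could be fed to any frame-level back end in place of MFCC alone; whether the paper derives its SBN feature this way or from a second, stacked network is not stated in the abstract.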


Bibliographic details

Source
IAPR Asian Conference on Pattern Recognition | 2017 | pp. 706-711 | 6 pages
Conference location: Nanjing (CN)
Authors
Yichi Huang; Yuexian Zou; Yi Liu
Original format: PDF
Keywords
Feature extraction; Phonetics; Mel frequency cepstral coefficient; Task analysis; Neural networks; Data mining; Benchmark testing


