An Investigation of Deep-Learning Frameworks for Speaker Verification Antispoofing

Chunlei Zhang; Chengzhu Yu; John H. L. Hansen

Abstract

In this study, we explore the use of deep-learning approaches for spoofing detection in speaker verification. Most spoofing detection systems that have achieved recent success employ hand-crafted features built on prior knowledge of specific spoofing attacks, which may limit their applicability to unseen spoofing attacks. We aim to investigate the genuine-spoofing discriminative ability of the back-end stage, utilizing recent advancements in deep-learning research. In this paper, alternative network architectures are exploited to target spoofed speech. Based on this analysis, a novel spoofing detection system, which simultaneously employs convolutional neural networks (CNNs) and recurrent neural networks (RNNs), is proposed. In this framework, the CNN is treated as a convolutional feature extractor applied to the speech input. On top of the CNN-processed output, recurrent networks are employed to capture long-term dependencies across the time domain. Novel features, including the Teager energy operator critical band autocorrelation envelope, perceptual minimum variance distortionless response, and a more general spectrogram, are also investigated as inputs to our proposed deep-learning frameworks. Experiments using the ASVspoof 2015 Corpus show that the integrated CNN–RNN framework achieves state-of-the-art single-system performance. The addition of score-level fusion further improves system robustness. A detailed analysis shows that our proposed approach can potentially compensate for the problems caused by short-duration test utterances, which also occur in the evaluation corpus.
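
The data flow described in the abstract (a convolutional feature extractor over the speech input, followed by a recurrent layer over time) can be illustrated with a minimal pure-Python sketch. This is not the authors' system: the abstract gives no layer sizes, recurrent cell type, or training details, so the plain tanh recurrent layer, the random untrained weights, the toy input, and every function and parameter name below (conv2d_relu, cnn_features, rnn_last_state, spoof_score, random_params) are assumptions made purely for illustration.

```python
import math
import random


def conv2d_relu(spec, kernel):
    """Valid 2-D cross-correlation over a time x frequency map, then ReLU."""
    kt, kf = len(kernel), len(kernel[0])
    n_time, n_freq = len(spec), len(spec[0])
    out = []
    for t in range(n_time - kt + 1):
        row = []
        for f in range(n_freq - kf + 1):
            s = sum(kernel[i][j] * spec[t + i][f + j]
                    for i in range(kt) for j in range(kf))
            row.append(max(0.0, s))
        out.append(row)
    return out


def cnn_features(spec, kernels):
    """Apply every kernel and concatenate the maps per time step."""
    maps = [conv2d_relu(spec, k) for k in kernels]
    return [sum((m[t] for m in maps), []) for t in range(len(maps[0]))]


def rnn_last_state(frames, w_in, w_rec, bias):
    """Plain tanh recurrent layer (an assumption); returns the final state."""
    hidden = [0.0] * len(bias)
    for x in frames:
        hidden = [
            math.tanh(
                sum(w * xi for w, xi in zip(w_in[h], x))
                + sum(w * hj for w, hj in zip(w_rec[h], hidden))
                + bias[h]
            )
            for h in range(len(bias))
        ]
    return hidden


def spoof_score(spec, params):
    """CNN front end -> recurrent layer over time -> logistic output in (0, 1)."""
    frames = cnn_features(spec, params["kernels"])
    hidden = rnn_last_state(frames, params["w_in"], params["w_rec"], params["bias"])
    logit = sum(w * h for w, h in zip(params["w_out"], hidden)) + params["b_out"]
    return 1.0 / (1.0 + math.exp(-logit))


def random_params(n_freq, n_kernels=2, kt=3, kf=3, n_hidden=4, seed=0):
    """Untrained random weights; a real system would learn these from data."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    feat_dim = n_kernels * (n_freq - kf + 1)
    return {
        "kernels": [mat(kt, kf) for _ in range(n_kernels)],
        "w_in": mat(n_hidden, feat_dim),
        "w_rec": mat(n_hidden, n_hidden),
        "bias": [0.0] * n_hidden,
        "w_out": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
        "b_out": 0.0,
    }


# Toy "spectrograms": 8 frequency bins, 10 and 40 frames (random values).
rng = random.Random(1)
short_utt = [[rng.random() for _ in range(8)] for _ in range(10)]
long_utt = [[rng.random() for _ in range(8)] for _ in range(40)]
params = random_params(n_freq=8)
print(spoof_score(short_utt, params), spoof_score(long_utt, params))
```

Because the recurrent layer consumes one CNN output frame at a time, the same parameters handle utterances of different lengths, which is the property that matters for the short-duration test utterances mentioned in the abstract. The scores printed here are meaningless until the weights are trained.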

Bibliographic details

Source: IEEE Journal of Selected Topics in Signal Processing, 2017, no. 4, pp. 684-694 (11 pages)
Authors: Chunlei Zhang; Chengzhu Yu; John H. L. Hansen
Affiliation (all three authors): Center for Robust Speech Systems, Erik Jonsson School of Engineering and Computer Science, University of Texas at Dallas, Richardson, TX, USA
Original format: PDF
Language: English
Keywords: Speech; Feature extraction; Machine learning; Recurrent neural networks; Context; Spectrogram; Robustness
