Deep generative variational autoencoding for replay spoof detection in automatic speaker verification

Bhusan Chettri; Tomi Kinnunen; Emmanouil Benetos

Abstract

Automatic speaker verification (ASV) systems are highly vulnerable to presentation attacks, also called spoofing attacks. Replay is among the simplest attacks to mount, yet it is difficult to detect reliably. The generalization failure of spoofing countermeasures (CMs) has driven the community to study various alternative deep learning CMs. The majority of them are supervised approaches that learn a human-spoof discriminator. In this paper, we advocate a different, deep generative approach that leverages powerful unsupervised manifold learning in classification. The potential benefits include the possibility to sample new data and to obtain insights into the latent features of genuine and spoofed speech. To this end, we propose to use variational autoencoders (VAEs) as an alternative backend for replay attack detection, via three alternative models that differ in their class-conditioning. The first one, similar to the use of Gaussian mixture models (GMMs) in spoof detection, is to independently train two VAEs, one for each class. The second one is to train a single conditional model (C-VAE) by injecting a one-hot class label vector into the encoder and decoder networks. Our final proposal integrates an auxiliary classifier to guide the learning of the latent space. Our experimental results using constant-Q cepstral coefficient (CQCC) features on the ASVspoof 2017 and 2019 physical access subtask datasets indicate that the C-VAE offers substantial improvement in comparison to training two separate VAEs, one for each class. On the 2019 dataset, the C-VAE outperforms the VAE and the baseline GMM by an absolute 9-10% in both equal error rate (EER) and tandem detection cost function (t-DCF) metrics. Finally, we propose VAE residuals, the absolute difference of the original input and the reconstruction, as features for spoofing detection. The proposed frontend approach augmented with a convolutional neural network classifier demonstrated substantial improvement over the VAE backend use case.
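
Three of the operations named in the abstract are simple enough to illustrate with a small NumPy sketch: attaching a one-hot class label to an input, forming a VAE residual, and estimating an equal error rate. This is not the authors' implementation. The function names are hypothetical, the abstract does not say how the label is injected (concatenation is assumed here as one common choice), and the EER routine is a plain threshold sweep that assumes higher scores mean "more genuine".

```python
import numpy as np

def one_hot(labels, num_classes=2):
    """Return an (N, num_classes) one-hot matrix for integer class labels."""
    labels = np.asarray(labels)
    out = np.zeros((labels.size, num_classes))
    out[np.arange(labels.size), labels] = 1.0
    return out

def condition(x, labels, num_classes=2):
    """Append the one-hot label to each feature vector.

    Assumption: conditioning by concatenation; the abstract only says the
    label vector is injected into the encoder and decoder networks.
    """
    return np.concatenate([x, one_hot(labels, num_classes)], axis=1)

def vae_residual(x, x_hat):
    """Absolute difference between an input and its reconstruction."""
    return np.abs(x - x_hat)

def equal_error_rate(genuine_scores, spoof_scores):
    """Approximate EER by sweeping thresholds over the observed scores.

    Assumption: a higher score means the trial looks more genuine.
    """
    genuine = np.asarray(genuine_scores, dtype=float)
    spoof = np.asarray(spoof_scores, dtype=float)
    thresholds = np.sort(np.concatenate([genuine, spoof]))
    # False rejection: genuine trial scored below the threshold.
    frr = np.array([(genuine < t).mean() for t in thresholds])
    # False acceptance: spoofed trial scored at or above the threshold.
    far = np.array([(spoof >= t).mean() for t in thresholds])
    # Take the threshold where the two error rates are closest.
    idx = np.argmin(np.abs(far - frr))
    return (far[idx] + frr[idx]) / 2.0
```

In a real system the reconstruction passed to `vae_residual` would come from a trained VAE decoder, and the scores passed to `equal_error_rate` would come from the countermeasure under evaluation; both are stand-ins here.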


Bibliographic details

Source
Computer Speech and Language | 2020, No. 9 | 101092.1-101092.18 | 18 pages
Authors
Bhusan Chettri; Tomi Kinnunen; Emmanouil Benetos;
Affiliations

School of Computing, University of Eastern Finland, Joensuu FI-80101, Finland; School of EECS, Queen Mary University of London, United Kingdom;

School of Computing, University of Eastern Finland, Joensuu FI-80101, Finland;

School of EECS, Queen Mary University of London, United Kingdom;

Indexed in: Science Citation Index (SCI), USA; Engineering Index (EI), USA;
Original format: PDF
Language: English
Keywords
Anti-spoofing; Presentation attack detection; Replay attack; Countermeasures; Deep generative models;


Similar literature


1. An assessment of automatic speaker verification vulnerabilities to replay spoofing attacks [J]. Janicki Artur, Alegre Federico, Evans Nicholas. Security and Communications Networks. 2016, No. 15
2. A New Replay Attack Against Automatic Speaker Verification Systems [J]. Yoon Sung-Hyun, Koh Min-Sung, Park Jae-Han, Quality Control, Transactions. 2020
3. Joint Decision of Anti-Spoofing and Automatic Speaker Verification by Multi-Task Learning With Contrastive Loss [J]. Li Jiakang, Sun Meng, Zhang Xiongwei, Quality Control, Transactions. 2020
4. Replay spoofing detection system for automatic speaker verification using multi-task learning of noise classes [C]. Hye-Jin Shim, Sung-Hyun Yoon, Jee-Weon Jung. Conference on Technologies and Applications of Artificial Intelligence. 2018
5. Discriminative and generative approaches for long- and short-term speaker characteristics modeling: Application to speaker verification [D]. Dehak, Najim. 2009
6. Automatic Fetal Middle Sagittal Plane Detection in Ultrasound Using Generative Adversarial Network [O]. Pei-Yin Tsai, Ching-Hui Hung, Chi-Yeh Chen. 2021
7. A Study on Replay Attack and Anti-Spoofing for Automatic Speaker Verification [O]. Li, Lantian, Chen, Yixiang, Wang, Dong. 2017
