Securing Voice-Driven Interfaces Against Fake (Cloned) Audio Attacks

Abstract

Voice cloning technologies have found applications in areas ranging from personalized speech interfaces to advertising and robotics. Existing voice cloning systems can learn speaker characteristics and use trained models to synthesize a person's voice from only a few audio samples. Recent cloned speech generation technologies can produce speech that is perceptually indistinguishable from bona-fide speech. These advances pose new security and privacy threats to voice-driven interfaces and speech-based access control systems. State-of-the-art speech synthesis technologies use trained or tuned generative models for cloned speech generation. Trained generative models rely on linear operations, learned weights, and an excitation source for cloned speech synthesis, and these systems leave characteristic artifacts in the synthesized speech. Higher-order spectral analysis is used to capture the attributes that differentiate bona-fide from cloned audio. Specifically, quadratic phase coupling (QPC) in the estimated bicoherence, Gaussianity test statistics, and linearity test statistics are used to capture generative model artifacts. The performance of the proposed method is evaluated on cloned audio generated using speaker adaptation-based and speaker encoding-based approaches. Experimental results on a dataset of 126 cloned and 8 bona-fide speech samples indicate that the proposed method can distinguish bona-fide from cloned audio with a close-to-perfect detection rate.
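The abstract does not give the estimator used for the bicoherence, so the following is only a minimal sketch of one common segment-averaged estimate, assuming NumPy and a Hann window. The function names `bicoherence` and `demo_signal`, and the parameters `nfft`, `hop`, `k1`, and `k2`, are illustrative choices and do not come from the paper. The Gaussianity and linearity test statistics are not shown.

```python
import numpy as np

def bicoherence(x, nfft=128, hop=64, eps=1e-12):
    """Squared bicoherence on the grid of FFT bins 0 <= f1, f2 < nfft // 2.

    Assumed estimator (not taken from the paper):
    |sum_k X_k(f1) X_k(f2) conj(X_k(f1+f2))|^2
      / (sum_k |X_k(f1) X_k(f2)|^2 * sum_k |X_k(f1+f2)|^2)
    """
    x = np.asarray(x, dtype=float)
    win = np.hanning(nfft)
    starts = range(0, len(x) - nfft + 1, hop)
    # One mean-removed, windowed segment per row.
    segs = np.array([x[s:s + nfft] for s in starts])
    segs = (segs - segs.mean(axis=1, keepdims=True)) * win
    X = np.fft.fft(segs, axis=1)

    f = np.arange(nfft // 2)
    f1, f2 = np.meshgrid(f, f, indexing="ij")
    X1, X2, X12 = X[:, f1], X[:, f2], X[:, f1 + f2]

    num = np.abs(np.sum(X1 * X2 * np.conj(X12), axis=0)) ** 2
    den = np.sum(np.abs(X1 * X2) ** 2, axis=0) * np.sum(np.abs(X12) ** 2, axis=0)
    return num / (den + eps)

def demo_signal(coupled, n_records=64, nfft=128, k1=10, k2=17, seed=0):
    """Hypothetical test signal: three tones at bins k1, k2, k1 + k2.

    Each record draws fresh random phases. When `coupled` is True the third
    phase equals the sum of the first two (quadratic phase coupling);
    otherwise it is drawn independently.
    """
    rng = np.random.default_rng(seed)
    t = np.arange(nfft)
    records = []
    for _ in range(n_records):
        p1, p2 = rng.uniform(0, 2 * np.pi, size=2)
        p3 = p1 + p2 if coupled else rng.uniform(0, 2 * np.pi)
        rec = (np.cos(2 * np.pi * k1 * t / nfft + p1)
               + np.cos(2 * np.pi * k2 * t / nfft + p2)
               + np.cos(2 * np.pi * (k1 + k2) * t / nfft + p3))
        records.append(rec + 0.1 * rng.standard_normal(nfft))
    return np.concatenate(records)
```

Calling `bicoherence(demo_signal(True), nfft=128, hop=128)` and reading the entry at `[10, 17]` should give a value near 1, while the uncoupled signal should give a value near 0 at the same entry. This contrast is the kind of QPC signature the abstract refers to, although how the paper turns it into a detection feature is not stated here.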


Bibliographic details

Source
Conference on Multimedia Information Processing and Retrieval | 2019 | pp. 512-517 | 6 pages
Author
Hafiz Malik
Original format: PDF
Keywords
Adaptation models; Speech synthesis; Linearity; Cloning; Spectral analysis; Distortion


