Journal: IEEE Journal of Selected Topics in Signal Processing

Postprocessing Synthetic Speech With a Complex Cepstrum Vocoder for Spoofing Phase-Based Synthetic Speech Detectors


Abstract

State-of-the-art speaker verification systems are vulnerable to spoofing attacks. To address the issue, high-performance synthetic speech detectors (SSDs) for existing spoofing methods have been proposed. Phase-based SSDs that exploit the fact that most of the parametric speech coders use minimum-phase filters are particularly successful when synthetic speech is generated with a parametric vocoder. Here, we propose a new attack strategy to spoof phase-based SSDs with the objective of increasing the security of voice verification systems by enabling the development of more generalized SSDs. As opposed to other parametric vocoders, the complex cepstrum approach uses mixed-phase filters, which makes it an ideal candidate for spoofing the phase-based SSDs. We propose using a complex cepstrum vocoder as a postprocessor to existing techniques to spoof the speaker verification system as well as the phase-based SSDs. Once synthetic speech is generated with a speech synthesis or a voice conversion technique, for each synthetic speech frame, a natural frame is selected from a training database using a spectral distance measure. Then, complex cepstrum parameters of the natural frame are used for resynthesizing the synthetic frame. In the proposed method, complex cepstrum-based resynthesis is used as a postprocessor. Hence, it can be used in tandem with any synthetic speech generator. Experimental results showed that the approach is successful at spoofing four phase-based SSDs across nine parametric attack algorithms. Moreover, performance at spoofing the speaker verification system did not substantially degrade compared to the case when no postprocessor is employed.
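The frame-selection and complex cepstrum steps described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify the spectral distance measure, the frame length, or how the natural frame's cepstral parameters drive the resynthesis. The log-spectral distance, the FFT size, and every function name below are assumptions made for illustration only.

```python
import numpy as np


def complex_cepstrum(frame, n_fft=1024):
    """Complex cepstrum of a real frame, plus the integer delay that was removed.

    Sketch only: the sign of the DC term is not handled.
    """
    spectrum = np.fft.fft(frame, n_fft)
    log_mag = np.log(np.abs(spectrum) + 1e-12)
    phase = np.unwrap(np.angle(spectrum))
    half = n_fft // 2
    # The unwrapped phase at Nyquist equals -pi * delay; remove that linear term.
    delay = int(np.round(phase[half] / -np.pi))
    phase = phase + np.pi * delay * np.arange(n_fft) / half
    cepstrum = np.real(np.fft.ifft(log_mag + 1j * phase))
    return cepstrum, delay


def inverse_complex_cepstrum(cepstrum, delay):
    """Rebuild the time-domain frame from its complex cepstrum and delay."""
    n_fft = len(cepstrum)
    log_spectrum = np.fft.fft(cepstrum)
    # Restore the linear-phase term removed during analysis.
    log_spectrum = log_spectrum - 1j * np.pi * delay * np.arange(n_fft) / (n_fft // 2)
    return np.real(np.fft.ifft(np.exp(log_spectrum)))


def log_spectral_distance(a, b, n_fft=1024):
    """RMS difference of log-magnitude spectra (an assumed distance measure)."""
    log_a = np.log(np.abs(np.fft.rfft(a, n_fft)) + 1e-12)
    log_b = np.log(np.abs(np.fft.rfft(b, n_fft)) + 1e-12)
    return float(np.sqrt(np.mean((log_a - log_b) ** 2)))


def select_natural_frame(synth_frame, natural_frames, n_fft=1024):
    """Index of the natural frame spectrally closest to the synthetic frame."""
    distances = [log_spectral_distance(synth_frame, f, n_fft) for f in natural_frames]
    return int(np.argmin(distances))


# Toy frames: one zero inside the unit circle (minimum phase) and one outside.
min_phase_frame = np.array([1.0, 0.5])
max_phase_frame = np.array([0.5, 1.0])

c_min, d_min = complex_cepstrum(min_phase_frame)
c_max, d_max = complex_cepstrum(max_phase_frame)

# Energy at negative quefrencies (stored in the upper half of the array)
# is what a minimum-phase filter cannot produce.
anticausal_min = float(np.sum(c_min[len(c_min) // 2:] ** 2))
anticausal_max = float(np.sum(c_max[len(c_max) // 2:] ** 2))
```

In this toy case both frames have the same magnitude spectrum, so a detector that looks only at magnitude cannot separate them. The minimum-phase frame has a purely causal cepstrum, while the other frame carries its energy at negative quefrencies. That anticausal part is the mixed-phase information the abstract says a complex cepstrum vocoder retains and minimum-phase parametric vocoders discard.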
