WAVEPURIFIER: Purify the Audio Adversarial Examples Using Hierarchical Diffusion Models

We present audio samples of our purification output under three attacks, compare them with existing defense methods, and show the transcriptions produced by ASR (Automatic Speech Recognition) models. The transcription of each audio clip appears beneath it.

[Code]   [PDF]   [Cite]

  • Benign: The benign audio.
  • XX AE: The adversarial example (AE) crafted by attack XX.
  • Down-Up 2k: Downsample the audio to 2 kHz and then upsample it back to defend against the AE. Proposed in WaveGuard (USENIX Security 2021).
  • LPC-10: Use Linear Predictive Coding with a 10th-order filter to defend against the AE. Proposed in WaveGuard (USENIX Security 2021).
  • Quant-8: Quantize the audio samples to 8 bits and then reconstruct them to defend against the AE. Proposed in WaveGuard (USENIX Security 2021).
  • ANR: Use adaptive noise reduction to defend against the AE. Proposed in PLoS Computational Biology.
  • SNR: Use stationary noise reduction to defend against the AE. Proposed in PLoS Computational Biology.
  • DiffSpec: Use a Mel-spectrogram-based diffusion model to purify the AE. Proposed in AudioPure (ICLR 2023).
  • DiffWave: Use a waveform-based diffusion model to purify the AE. Proposed in AudioPure (ICLR 2023).
  • Ours: Use our WAVEPURIFIER to purify the AE.
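
To make the two simplest baselines in the list above concrete, here is a minimal sketch of the quantize-and-reconstruct and down-up transforms. It is an illustration under stated assumptions, not the WaveGuard implementation: the function names, the [-1.0, 1.0] sample range, and the default factor of 8 (which would correspond to 16 kHz audio reduced to 2 kHz) are all hypothetical, and the block averaging with linear interpolation is a crude stand-in for a proper resampling filter.

```python
def quantize_dequantize(samples, bits=8):
    """Snap each sample (assumed in [-1.0, 1.0]) to one of 2**bits uniform levels."""
    step = 2.0 / (2 ** bits - 1)            # spacing between adjacent levels
    out = []
    for x in samples:
        x = max(-1.0, min(1.0, x))          # clip to the assumed value range
        code = round((x + 1.0) / step)      # integer code in [0, 2**bits - 1]
        out.append(code * step - 1.0)       # reconstruct the float sample
    return out


def down_up(samples, factor=8):
    """Crude down-up sketch: block-average by `factor`, then interpolate back."""
    n = len(samples)
    # Downsample: average each block of `factor` samples (rough low-pass + decimate).
    low = []
    for start in range(0, n, factor):
        block = samples[start:start + factor]
        low.append(sum(block) / len(block))
    # Upsample: linear interpolation between block centers, back to n samples.
    out = []
    for t in range(n):
        pos = (t - (factor - 1) / 2) / factor       # position in low-rate units
        pos = max(0.0, min(len(low) - 1.0, pos))    # clamp at both edges
        i = int(pos)
        j = min(i + 1, len(low) - 1)
        frac = pos - i
        out.append(low[i] * (1 - frac) + low[j] * frac)
    return out
```

Both transforms keep the clip length unchanged and discard fine-grained detail, which is where small adversarial perturbations tend to live; the cost is that benign speech content is degraded as well.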

C&W Attack - S&P Workshop 2018

ASR model: DeepSpeech 0.4.1 (Model Description)

Benign | C&W AE | Down-Up 2k | LPC-10 | Quant-8 | ANR | SNR | DiffSpec | DiffWave | Ours

QIN-I Attack - PMLR 2019

ASR model: Lingvo ASR (Model Description)

Benign | QIN-I AE | Down-Up 2k | LPC-10 | Quant-8 | ANR | SNR | DiffSpec | DiffWave | Ours

SpecPatch Attack - CCS 2022

ASR model: DeepSpeech 0.4.1 (Model Description)

Benign | SpecPatch AE | Down-Up 2k | LPC-10 | Quant-8 | ANR | SNR | DiffSpec | DiffWave | Ours