WAVEPURIFIER: Purify the Audio Adversarial Examples Using Hierarchical Diffusion Models

We present audio examples of our purification output under three attacks. We compare the results with existing defense methods and present the transcriptions produced by ASR (Automatic Speech Recognition) models. The transcription for each audio clip is provided under the clip.

Benign: The benign audio.
XX AE: The adversarial example (AE) crafted by the corresponding audio adversarial attack, where XX is the attack name.
Down-Up 2k: Downsample the audio to 2 kHz and then upsample it back to defend against the AE. Proposed in USENIX 2021: WaveGuard
LPC-10: Use Linear Predictive Coding with a 10th-order filter to defend against the AE. Proposed in USENIX 2021: WaveGuard
Quant-8: Quantize the audio samples to 8 bits and then reconstruct them to defend against the AE. Proposed in USENIX 2021: WaveGuard
ANR: Use adaptive noise reduction to defend against the AE. Proposed in PLoS Computational Biology
SNR: Use stationary noise reduction to defend against the AE. Proposed in PLoS Computational Biology
DiffSpec: Use a Mel-spectrogram-based diffusion model to purify the AE. Proposed in ICLR 2023: AudioPure
DiffWave: Use a waveform-based diffusion model to purify the AE. Proposed in ICLR 2023: AudioPure
Ours: Use our WAVEPURIFIER to defend against the AE.
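
To give a feel for what the two simplest baseline transforms in the list above do to a waveform, here is a minimal Python sketch of the Down-Up and Quant-8 ideas. It is an illustration only, not the WaveGuard implementation: the function names, the 16 kHz input rate, the [-1.0, 1.0] sample range, and the use of plain linear interpolation (with no anti-aliasing filter) are all assumptions made for this example.

```python
def quantize(samples, bits=8):
    """Round samples in [-1.0, 1.0] to 2**bits uniform levels, then map back to floats."""
    levels = 2 ** bits
    step = 2.0 / (levels - 1)  # spacing between adjacent levels
    out = []
    for x in samples:
        x = max(-1.0, min(1.0, x))       # clip to the assumed range
        code = round((x + 1.0) / step)   # integer code in [0, levels - 1]
        out.append(code * step - 1.0)    # reconstruct the float value
    return out


def resample_linear(samples, src_rate, dst_rate):
    """Resample by linear interpolation (assumption: no anti-aliasing filter)."""
    n_out = max(1, round(len(samples) * dst_rate / src_rate))
    last = len(samples) - 1
    out = []
    for i in range(n_out):
        pos = i * src_rate / dst_rate    # fractional position in the input
        lo = min(int(pos), last)
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append(samples[lo] * (1.0 - frac) + samples[hi] * frac)
    return out


def down_up(samples, rate=16000, low_rate=2000):
    """Downsample to low_rate and upsample back to rate."""
    low = resample_linear(samples, rate, low_rate)
    return resample_linear(low, low_rate, rate)
```

Both functions take a list of float samples and return a list of the same length. A production pipeline would more likely use an audio library with a properly filtered resampler; the sketch only shows why these transforms discard fine detail, which is where small adversarial perturbations tend to live.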

C&W Attack - S&P Workshop 2018

ASR model: DeepSpeech 0.4.1-Model Description

Benign	C&W AE	Down-Up 2k	LPC-10	Quant-8	ANR	SNR	DiffSpec	DiffWave	Ours

QIN-I Attack - PMLR 2019

ASR model: LingVo ASR-Model Description

Benign	QIN-I AE	Down-Up 2k	LPC-10	Quant-8	ANR	SNR	DiffSpec	DiffWave	Ours

SpecPatch Attack - CCS 2022

ASR model: DeepSpeech 0.4.1-Model Description

Benign	SpecPatch AE	Down-Up 2k	LPC-10	Quant-8	ANR	SNR	DiffSpec	DiffWave	Ours