WAVEPURIFIER: Purify the Audio Adversarial Examples Using Hierarchical Diffusion Models
We present audio examples of our purification output under three attacks. We compare the results with existing defense methods and present the transcriptions produced by ASR (Automatic Speech Recognition) models.
For each audio sample, the transcription is shown below the audio.
[Code] [PDF] [Cite]
- Benign: The benign audio.
- XX AE: The adversarial example (AE) crafted by the audio adversarial attack XX.
- Down-Up 2k: Downsample the audio to 2 kHz and upsample it back to defend against the AE. Proposed in WaveGuard (USENIX 2021).
- LPC-10: Use Linear Predictive Coding with a 10th-order filter to defend against the AE. Proposed in WaveGuard (USENIX 2021).
- Quant-8: Quantize the audio samples to 8 bits and then reconstruct them to defend against the AE. Proposed in WaveGuard (USENIX 2021).
- ANR: Use adaptive noise reduction to defend against the AE. Proposed in PLoS Computational Biology.
- SNR: Use stationary noise reduction to defend against the AE. Proposed in PLoS Computational Biology.
- DiffSpec: Use a Mel-spectrogram-based diffusion model to purify the AE. Proposed in AudioPure (ICLR 2023).
- DiffWave: Use a waveform-based diffusion model to purify the AE. Proposed in AudioPure (ICLR 2023).
- Ours: Use our WAVEPURIFIER to defend against the AE.
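To make the two simplest baselines in the list concrete, the following is a minimal sketch of a down-up resampling transform and an 8-bit quantization transform. It is not the WaveGuard implementation: the function names `down_up` and `quantize`, the 16 kHz input sample rate, and the assumption that samples are floats in [-1, 1] are all illustrative choices made here.

```python
import numpy as np
from scipy.signal import resample_poly


def down_up(audio, orig_sr=16000, target_sr=2000):
    """Resample to target_sr and back to orig_sr (hypothetical helper).

    The anti-aliasing filter removes most content above target_sr / 2,
    which is where this transform is expected to discard perturbations.
    orig_sr = 16000 is an assumption, not a value stated on this page.
    """
    audio = np.asarray(audio, dtype=np.float64)
    down = resample_poly(audio, target_sr, orig_sr)  # orig_sr -> target_sr
    up = resample_poly(down, orig_sr, target_sr)     # target_sr -> orig_sr
    # Trim or zero-pad so the output has the same length as the input.
    out = np.zeros(len(audio), dtype=np.float64)
    n = min(len(audio), len(up))
    out[:n] = up[:n]
    return out


def quantize(audio, bits=8):
    """Round float samples in [-1, 1] to 2**bits uniform levels (hypothetical helper)."""
    audio = np.asarray(audio, dtype=np.float64)
    levels = 2 ** (bits - 1)  # 128 for 8 bits
    q = np.clip(np.round(audio * levels), -levels, levels - 1)
    return q / levels  # map the integer codes back to floats
```

Under these assumptions, a transformed waveform such as `quantize(down_up(audio))` would be passed to the ASR model in place of the original input. Both transforms are lossy for benign audio as well, which is why the page lists a Benign sample alongside each defended one.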
ASR model: DeepSpeech 0.4.1 - Model Description
| Benign | C&W AE | Down-Up 2k | LPC-10 | Quant-8 | ANR | SNR | DiffSpec | DiffWave | Ours |
ASR model: LingVo ASR - Model Description
| Benign | QIN-I AE | Down-Up 2k | LPC-10 | Quant-8 | ANR | SNR | DiffSpec | DiffWave | Ours |
SpecPatch Attack - CCS 2022
ASR model: DeepSpeech 0.4.1 - Model Description
| Benign | SpecPatch AE | Down-Up 2k | LPC-10 | Quant-8 | ANR | SNR | DiffSpec | DiffWave | Ours |