Adversarial attacks on visual systems
Adversarial training: Here we generate a large number of adversarial examples and explicitly train the network on them, so that the network learns to classify perturbed inputs correctly when someone attempts an attack.
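The cited paper's Fast Gradient Sign Method (FGSM) is both a simple attack and the usual example generator for this kind of training. Below is a minimal sketch in PyTorch (an assumed framework, not one named in the text); the model, optimizer, epsilon value, and the [0, 1] pixel range are illustrative placeholders.

```python
import torch
import torch.nn.functional as F

def fgsm_example(model, x, y, epsilon=0.1):
    """Craft an adversarial example with FGSM (Goodfellow et al., 2015):
    x_adv = x + epsilon * sign(grad_x loss)."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    # Clamp assumes inputs are normalized to [0, 1].
    return (x + epsilon * x.grad.sign()).clamp(0, 1).detach()

def adversarial_training_step(model, optimizer, x, y, epsilon=0.1):
    """One training step on a mix of clean and adversarial examples."""
    x_adv = fgsm_example(model, x, y, epsilon)
    optimizer.zero_grad()
    # Average the clean and adversarial losses so the model stays accurate
    # on natural inputs while becoming robust to perturbed ones.
    loss = 0.5 * (F.cross_entropy(model(x), y)
                  + F.cross_entropy(model(x_adv), y))
    loss.backward()
    optimizer.step()
    return loss.item()
```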
Defensive distillation: Here we train a second model on the class probabilities output by a first model, instead of on fixed (hard) class labels. The distilled model's decision surface is smoother in the directions an attacker typically targets, which makes it harder to find the small input tweaks that lead to incorrect categorization.
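A minimal sketch of the distillation step, again in PyTorch; the teacher network, temperature value, and surrounding training loop are assumptions for illustration (defensive distillation itself is due to Papernot et al., not the paper cited below). A teacher is first trained at a high softmax temperature T, then a student is fit to the teacher's soft probabilities at the same temperature.

```python
import torch
import torch.nn.functional as F

def distillation_step(student, teacher, optimizer, x, temperature=20.0):
    """One training step of the student on the teacher's soft labels."""
    with torch.no_grad():
        # Soft targets: a high temperature smooths the probability surface,
        # which is what makes gradient-based attacks harder to aim.
        soft_targets = F.softmax(teacher(x) / temperature, dim=1)
    optimizer.zero_grad()
    log_probs = F.log_softmax(student(x) / temperature, dim=1)
    # Cross-entropy against soft targets (KL divergence up to a constant).
    loss = -(soft_targets * log_probs).sum(dim=1).mean()
    loss.backward()
    optimizer.step()
    return loss.item()
```

At test time the temperature is set back to 1, which flattens the gradients an attacker would otherwise follow.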
Reference: Goodfellow et al., Explaining and Harnessing Adversarial Examples, ICLR 2015.