Artificial Intelligence

Adversarial Attack

An input deliberately crafted to fool an AI model into making incorrect predictions. Adversarial examples often look normal to humans but cause models to fail spectacularly.

Why It Matters

Adversarial attacks expose fundamental weaknesses in AI systems. A self-driving car that can be fooled by a sticker on a stop sign is a safety-critical vulnerability.

Example

Adding an imperceptible noise pattern to a photo of a panda causes a classifier to confidently label it a gibbon, even though the image looks unchanged to human eyes.
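The panda example comes from the Fast Gradient Sign Method (FGSM): nudge every input feature a tiny amount in the direction that increases the model's loss. Below is a minimal sketch of that idea on a toy logistic-regression "classifier" with random weights; the weights, inputs, and epsilon are illustrative stand-ins, not a real image model.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm(x, w, b, y_true, epsilon):
    """Fast Gradient Sign Method: step each feature by +/- epsilon
    in the direction that increases the loss for the true label."""
    p = sigmoid(w @ x + b)                # model's predicted probability
    grad_x = (p - y_true) * w             # gradient of cross-entropy loss w.r.t. x
    return x + epsilon * np.sign(grad_x)  # per-feature change bounded by epsilon

# Toy setup: 100 random "pixels" and random weights (illustrative only).
rng = np.random.default_rng(0)
w = rng.normal(size=100)
b = 0.0
x = rng.normal(size=100)
# Treat the model's own clean prediction as the "true" label to attack.
y_true = 1.0 if sigmoid(w @ x + b) > 0.5 else 0.0

x_adv = fgsm(x, w, b, y_true, epsilon=0.1)
print("clean prob of class 1:", sigmoid(w @ x + b))
print("adv prob of class 1:  ", sigmoid(w @ x_adv + b))
print("max per-feature change:", np.abs(x_adv - x).max())
```

Note the attack's key property: no single feature changes by more than epsilon, yet the summed effect across all features can swing the model's output dramatically, which is why the perturbed image looks unchanged to humans.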

Think of it like...

Like an optical illusion that tricks the human eye — adversarial examples exploit the model's 'perception' weaknesses in ways that are invisible to us.

Related Terms