Taiju Sanagi: Experiments

Anomaly Detection

Note
Updated: April 22, 2025

🚨 Introduction to Anomaly Detection

Anomaly detection is about finding things that don’t belong.

These could be:

  • A fraudulent credit card transaction
  • A faulty machine sensor reading
  • An unusual customer behavior

We want to identify rare, unusual patterns that are different from the normal data — without necessarily having labels for them.

1. What Is an Anomaly?

An anomaly (or outlier) is a data point that is significantly different from the rest of the dataset.

There are three common types:

  • Point anomalies — a single abnormal value (e.g. a $10,000 charge when most are under $100)
  • Contextual anomalies — normal in one context, strange in another (e.g. 40°C is normal in summer, not in winter)
  • Collective anomalies — a group of points is strange together (e.g. multiple failed logins in a row)
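To make the contextual case concrete, here is a minimal sketch: the same 40°C reading blends in among summer temperatures but stands out among winter ones. The temperature values and the z-score threshold of 2 are made-up, illustrative assumptions, not data from this note.

```python
import numpy as np

# Toy temperature readings in °C; values and the threshold of 2 are illustrative.
summer = np.array([32, 35, 33, 40, 34, 36, 38, 31])
winter = np.array([2, -1, 0, 3, 1, -2, 40, 2])

def zscore_flags(x, threshold=2.0):
    # Flag values far from the mean of their own context (here: the season).
    z = (x - x.mean()) / x.std()
    return np.abs(z) > threshold

print(zscore_flags(summer))  # all False: 40°C blends in with other summer readings
print(zscore_flags(winter))  # only the 40°C reading is flagged: unusual for winter
```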

2. Why Use Unsupervised Methods?

In most real-world cases:

  • We don’t have labels saying which data points are normal vs. abnormal
  • Anomalies are rare, making supervised training difficult

So we often use unsupervised algorithms to find anomalies based on the structure of the data.

3. How Does It Work?

The core idea is:

"Learn what normal looks like. Then flag anything that’s far from it."

Common unsupervised strategies include (each is sketched in code after this list):

• Distance-based

  • Anomalies are far from the center or neighbors
  • Example: k-Nearest Neighbors (kNN) distance; Isolation Forest is usually grouped here too, though it isolates anomalies with random splits rather than measuring distances directly

• Density-based

  • Anomalies live in low-density regions
  • Example: Local Outlier Factor (LOF), DBSCAN

• Model-based

  • Fit a model to the data (e.g. PCA, Gaussian) and flag points that don’t fit
  • Example: One-Class SVM, Autoencoders, Elliptic Envelope
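A minimal sketch of the distance-based idea, assuming scikit-learn is installed. The synthetic cluster, the two injected outliers, k = 5, and contamination = 0.01 are illustrative choices, not prescribed values.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(42)
# Mostly "normal" 2-D points plus two obvious outliers (synthetic, illustrative).
X = np.vstack([rng.normal(0, 1, size=(200, 2)),
               np.array([[8.0, 8.0], [-9.0, 7.0]])])

# kNN distance: score each point by how far away its 5th nearest neighbor is.
nn = NearestNeighbors(n_neighbors=6).fit(X)   # 6 because each point is its own 0-distance neighbor
distances, _ = nn.kneighbors(X)
knn_score = distances[:, -1]                  # distance to the 5th true neighbor
print("largest kNN distances:", np.argsort(knn_score)[-2:])  # the two injected outliers

# Isolation Forest: anomalies need fewer random splits to be isolated.
iso = IsolationForest(contamination=0.01, random_state=0)
labels = iso.fit_predict(X)                   # -1 = anomaly, 1 = normal
print("Isolation Forest flags:", np.where(labels == -1)[0])  # typically indices 200 and 201
```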
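A sketch of the density-based idea with Local Outlier Factor, again assuming scikit-learn. The dense cluster, the three isolated points, and n_neighbors = 20 (the scikit-learn default) are illustrative.

```python
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.default_rng(42)
# One dense cluster plus a few points sitting in low-density regions (synthetic, illustrative).
dense = rng.normal(0, 0.5, size=(200, 2))
sparse = np.array([[5.0, 5.0], [5.5, -4.0], [-6.0, 0.0]])
X = np.vstack([dense, sparse])

# LOF compares each point's local density to its neighbors' densities:
# a score well above 1 means "much less dense than the points around it".
lof = LocalOutlierFactor(n_neighbors=20)
labels = lof.fit_predict(X)             # -1 = anomaly, 1 = normal
scores = -lof.negative_outlier_factor_  # higher = more anomalous
print("flagged indices:", np.where(labels == -1)[0])
print("their LOF scores:", scores[labels == -1].round(2))
```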
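A sketch of the model-based idea using Elliptic Envelope (a robust Gaussian fit) as a stand-in for the family; One-Class SVM and autoencoders follow the same fit-on-normal, score-new-data pattern. The synthetic training data, the test points, and contamination = 0.01 are illustrative assumptions.

```python
import numpy as np
from sklearn.covariance import EllipticEnvelope

rng = np.random.default_rng(42)
# Training data assumed to be (mostly) normal behaviour (synthetic, illustrative).
X_train = rng.normal(loc=[0.0, 0.0], scale=[1.0, 2.0], size=(500, 2))

# Fit a robust Gaussian model of "normal", then flag points it explains poorly.
model = EllipticEnvelope(contamination=0.01, random_state=0).fit(X_train)

X_new = np.array([[0.3, -1.0],    # plausible under the fitted Gaussian
                  [7.0, 15.0]])   # very unlikely under it
print(model.predict(X_new))               # [ 1 -1]  (1 = normal, -1 = anomaly)
print(model.mahalanobis(X_new).round(1))  # squared Mahalanobis distance to the fitted model
```

Note that the scikit-learn estimators in all three sketches share the same convention: fit on the data, then predict returns 1 for points treated as normal and -1 for points flagged as anomalies, which makes it easy to swap one strategy for another.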