What is a Classifier?

A classifier is a machine learning model that predicts which category (class) a given input belongs to. It's one of the most fundamental ML concepts: given an input, assign it a label from a predefined set. Spam detection, sentiment analysis, image recognition, and medical diagnosis are all classification tasks.

Why It Matters

Classification is among the most widely deployed forms of machine learning. Spam filters, content moderation systems, fraud detectors, and image taggers are built on classifiers, and many product recommendation engines use them as components. Understanding classification is essential for understanding how ML creates value in practice, and it is the foundation on which more complex AI systems are built.

How It Works

Types of classification:

Binary classification — two possible classes:

Spam / not spam
Fraudulent / legitimate
Positive / negative sentiment

Multi-class classification — multiple classes, one correct:

Image → cat / dog / bird / fish
Document → sports / politics / technology / entertainment

Multi-label classification — multiple classes can be correct:

A movie can be both "action" and "comedy"
A news article can cover "politics" and "economy"
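
One way to see the difference between the three settings is in how the target label is represented. The following is a minimal sketch in Python; the class lists and the helper names one_hot and multi_hot are hypothetical and chosen only for illustration.

```python
# Hypothetical label sets, for illustration only.
ANIMALS = ["cat", "dog", "bird", "fish"]
GENRES = ["action", "comedy", "drama"]

def one_hot(label, classes):
    """Multi-class target: exactly one position is 1."""
    return [1 if c == label else 0 for c in classes]

def multi_hot(labels, classes):
    """Multi-label target: any number of positions may be 1."""
    chosen = set(labels)
    return [1 if c in chosen else 0 for c in classes]

binary_target = 1                                  # binary: spam (1) vs. not spam (0)
multiclass_target = one_hot("dog", ANIMALS)        # one correct class
multilabel_target = multi_hot(["action", "comedy"], GENRES)  # several correct classes

print(binary_target)       # 1
print(multiclass_target)   # [0, 1, 0, 0]
print(multilabel_target)   # [1, 1, 0]
```

A multi-class target always sums to 1, while a multi-label target can sum to anything from 0 to the number of classes.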

Common classifier algorithms:

Logistic Regression — simple, interpretable, good baseline
Decision Trees / Random Forest — tree-based splitting rules
Support Vector Machines (SVM) — find optimal decision boundaries
Neural Networks — deep learning classifiers for complex patterns
Naive Bayes — probabilistic, fast, good for text
LLMs as classifiers — prompt an LLM with "Classify this text as..." (zero-shot classification)
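
To make one of these algorithms concrete, here is a small from-scratch sketch of a multinomial Naive Bayes text classifier with add-one (Laplace) smoothing. The training sentences and the function names train_naive_bayes and predict are assumptions made for this example; a production system would normally rely on a tested library implementation and far more data.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(examples):
    """Count documents per class and words per class.

    examples: list of (text, label) pairs.
    """
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        class_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocab.add(word)
    return class_counts, word_counts, vocab

def predict(model, text):
    """Return the class with the highest log posterior score."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)          # log prior
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocab:
                continue                               # skip words unseen in training
            count = word_counts[label][word]
            # Add-one smoothing keeps unseen (word, class) pairs from zeroing the score.
            score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Toy training set, invented for illustration.
TRAIN = [
    ("win free money now", "spam"),
    ("free prize claim now", "spam"),
    ("meeting agenda for monday", "not spam"),
    ("lunch on monday", "not spam"),
]

model = train_naive_bayes(TRAIN)
print(predict(model, "claim your free money"))   # spam
print(predict(model, "monday meeting"))          # not spam
```

Working in log space avoids numerical underflow when many small word probabilities are multiplied together.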

Evaluation metrics:

Accuracy — % of correct predictions
Precision — of predicted positives, how many were actually positive?
Recall — of actual positives, how many were found?
F1 Score — harmonic mean of precision and recall
Confusion matrix — table showing all correct and incorrect predictions
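
The metrics above can be computed directly from the four cells of a binary confusion matrix. The sketch below does this in plain Python; the label lists are made up for illustration, and the function names confusion_counts and metrics are hypothetical.

```python
def confusion_counts(y_true, y_pred, positive):
    """Return (tp, fp, fn, tn) for a binary task."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1   # correctly flagged
        elif pred == positive:
            fp += 1   # flagged, but actually negative
        elif truth == positive:
            fn += 1   # missed positive
        else:
            tn += 1   # correctly left alone
    return tp, fp, fn, tn

def metrics(y_true, y_pred, positive):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Invented example: 3 spam and 7 legitimate messages.
Y_TRUE = ["spam"] * 3 + ["not spam"] * 7
Y_PRED = ["spam", "not spam", "not spam", "spam"] + ["not spam"] * 6

print(confusion_counts(Y_TRUE, Y_PRED, positive="spam"))   # (1, 1, 2, 6)
for name, value in metrics(Y_TRUE, Y_PRED, positive="spam").items():
    print(name, round(value, 3))
# accuracy 0.7
# precision 0.5
# recall 0.333
# f1 0.4
```

In this example accuracy is 70% even though the classifier finds only one of the three spam messages, which is why precision and recall are reported alongside accuracy when classes are imbalanced.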

The classification pipeline:

Collect labeled data (input + correct class)
Extract features (or use embeddings)
Train the classifier
Evaluate on held-out test data
Deploy and monitor
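
The steps above can be sketched end to end in a few lines of Python. This is an illustration under stated assumptions, not a recommended design: the keyword list, the three hand-picked features, and the tiny dataset are all hypothetical, and a nearest-centroid rule stands in for whichever classifier would actually be trained.

```python
# Step 1: labeled data (invented), already split into train and held-out test sets.
TRAIN = [
    ("win free money now!!!", "spam"),
    ("claim your free prize http://example.test", "spam"),
    ("agenda for monday meeting", "not spam"),
    ("are you free for lunch?", "not spam"),
]
TEST = [
    ("free money!!! claim now http://example.test", "spam"),
    ("notes from the monday meeting", "not spam"),
]

SPAM_WORDS = {"free", "win", "money", "prize", "claim"}  # hypothetical keyword list

# Step 2: feature extraction, turning raw text into a numeric vector.
def extract_features(text):
    words = [w.strip(".,!?") for w in text.lower().split()]
    return [
        sum(w in SPAM_WORDS for w in words),   # number of keyword hits
        text.count("!"),                       # number of exclamation marks
        1 if "http" in text.lower() else 0,    # whether a link is present
    ]

# Step 3: training. Nearest-centroid: average the feature vectors of each class.
def train(examples):
    sums, counts = {}, {}
    for text, label in examples:
        feats = extract_features(text)
        if label not in sums:
            sums[label] = [0.0] * len(feats)
            counts[label] = 0
        sums[label] = [s + f for s, f in zip(sums[label], feats)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}

def predict(centroids, text):
    feats = extract_features(text)
    def sq_dist(center):
        return sum((a - b) ** 2 for a, b in zip(feats, center))
    # Choose the class whose centroid is closest to this input.
    return min(centroids, key=lambda label: sq_dist(centroids[label]))

# Step 4: evaluation on data the classifier has not seen.
def evaluate(centroids, examples):
    correct = sum(predict(centroids, text) == label for text, label in examples)
    return correct / len(examples)

centroids = train(TRAIN)
print(evaluate(centroids, TEST))   # 1.0

# Step 5: in deployment, predict() would be called on new inputs, and its
# outputs would be logged and compared against later feedback.
```

A perfect score on two test examples says nothing about real-world quality; the point is only that evaluation uses examples kept out of training.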

Example

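Gmail's spam filter is a binary classifier: for each incoming email, it extracts features (such as sender, subject, content, and links) and predicts "spam" or "not spam." It was trained on millions of labeled emails and, according to Google, blocks more than 99.9% of spam, phishing, and malware while classifying billions of emails daily. That figure describes the share of unwanted mail caught, which is closer to recall than to overall accuracy.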