
What are Overfitting and Underfitting?
Overfitting and underfitting are two fundamental failure modes in machine learning that describe how well a model generalizes from training data to new, unseen data.
- Overfitting: The model memorizes the training data too precisely, including its noise and quirks, and fails to generalize to new data.
- Underfitting: The model is too simple to capture the underlying patterns in the data, performing poorly on both training and new data.
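The difference is easy to see in a toy experiment: fitting polynomials of low and high degree to noisy samples of a sine wave. This is an illustrative sketch, not a canonical benchmark; the degrees, noise level, and sample counts are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(0)

# Noisy training samples of a sine wave: a true pattern plus noise.
x_train = np.sort(rng.uniform(0, 2 * np.pi, 20))
y_train = np.sin(x_train) + rng.normal(0, 0.2, x_train.size)
# Noise-free test targets to measure generalization.
x_test = np.linspace(0, 2 * np.pi, 200)
y_test = np.sin(x_test)

def mse(model, x, y):
    return float(np.mean((model(x) - y) ** 2))

results = {}
for degree in (1, 15):
    # Polynomial.fit rescales x internally, keeping high-degree fits stable.
    model = np.polynomial.Polynomial.fit(x_train, y_train, degree)
    results[degree] = (mse(model, x_train, y_train),
                       mse(model, x_test, y_test))

for degree, (train_err, test_err) in results.items():
    print(f"degree {degree:2d}: train MSE {train_err:.3f}, test MSE {test_err:.3f}")
```

The degree-1 line is too simple to follow the curve, so its error is high on both sets (underfitting); the degree-15 polynomial drives training error toward zero by chasing the noise, but its test error stays much higher than its training error, the classic overfitting signature.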
Why It Matters
The balance between overfitting and underfitting, known as the bias-variance tradeoff, is central to building effective ML systems. An overfit model gives false confidence (great on training data, terrible in production). An underfit model performs poorly everywhere. Every ML practitioner must navigate this tradeoff.
How It Works
Overfitting happens when:
- The model is too complex for the amount of training data
- Training runs for too many epochs
- The model has too many parameters relative to the data
Signs: training loss is very low but validation loss is high.
Remedies: more training data, regularization (L1/L2, dropout), early stopping, data augmentation, reducing model complexity.
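As one concrete example of these remedies, L2 regularization can be sketched with closed-form ridge regression. The penalty strength and data shapes below are arbitrary illustrative choices, not recommended defaults:

```python
import numpy as np

rng = np.random.default_rng(1)

# 30 samples, 25 features, but only 3 actually matter: a setting
# where plain least squares has enough freedom to fit noise.
n, d = 30, 25
X = rng.normal(size=(n, d))
w_true = np.zeros(d)
w_true[:3] = [2.0, -1.0, 0.5]
y = X @ w_true + rng.normal(0, 0.5, n)

def ridge_fit(X, y, lam):
    """Closed-form ridge solution: (X^T X + lam*I)^-1 X^T y.

    lam = 0 recovers ordinary least squares; larger lam shrinks
    the weights toward zero, limiting model complexity.
    """
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

w_ols = ridge_fit(X, y, 0.0)    # unregularized fit
w_ridge = ridge_fit(X, y, 10.0)  # L2-penalized fit

print("OLS   weight norm:", np.linalg.norm(w_ols))
print("Ridge weight norm:", np.linalg.norm(w_ridge))
```

The ridge weights always have a smaller norm than the unregularized ones: the penalty trades a little training accuracy for a simpler hypothesis, which is exactly the lever regularization pulls against overfitting.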
Underfitting happens when:
- The model is too simple for the complexity of the data
- The features don't capture the relevant information
- Regularization is too strong
Signs: both training loss and validation loss are high.
Remedies: a more complex model, better feature engineering, training longer, reducing regularization.