
What is Bias in Machine Learning?
Bias in machine learning refers to systematic errors that cause a model to produce unfair, inaccurate, or discriminatory results. It can originate from training data, model design, or the way results are interpreted and applied.
Why It Matters
AI systems increasingly inform decisions that affect people's lives in areas such as hiring, lending, criminal justice, and healthcare. When these systems are biased, they can perpetuate or amplify existing societal inequalities at scale. Understanding and mitigating ML bias is critical for building AI that is fair, trustworthy, and beneficial.
How It Works
Bias enters ML systems at multiple stages:
Data bias (often the most common source):
- Historical bias — training data reflects past discrimination (e.g., hiring data from a biased era)
- Representation bias — certain groups are underrepresented in training data
- Measurement bias — features are measured differently across groups
- Labeling bias — human annotators inject their own biases into labels
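As an illustration of checking for representation bias, the sketch below compares each group's share of a training set with a reference share. The function name `representation_gaps`, the group labels, and the reference shares are all hypothetical; in practice the reference would come from the population the model is meant to serve.

```python
from collections import Counter


def representation_gaps(groups, reference_shares):
    """Return (observed share - reference share) for each reference group.

    A negative value means the group is underrepresented in the data
    relative to the reference.
    """
    if not groups:
        raise ValueError("groups must not be empty")
    counts = Counter(groups)
    total = len(groups)
    return {
        group: counts.get(group, 0) / total - expected
        for group, expected in reference_shares.items()
    }


# Hypothetical toy data: the group label of each training example.
training_groups = ["a"] * 80 + ["b"] * 20
# Hypothetical shares in the population the model is meant to serve.
reference = {"a": 0.5, "b": 0.5}

gaps = representation_gaps(training_groups, reference)
for group in sorted(gaps):
    print(f"group {group}: {gaps[group]:+.2f}")
```

A check like this only surfaces representation gaps. It says nothing about historical, measurement, or labeling bias, which usually require inspecting how the data was collected and annotated.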
Algorithmic bias:
- Models can amplify patterns in the data, including discriminatory ones
- Optimization objectives may not account for fairness
- Proxy features: even when protected attributes are excluded, models can pick up correlated features that stand in for them (e.g., zip code as a proxy for race)
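One rough way to probe for a proxy is to ask how well a protected attribute can be guessed from a single feature. The sketch below is a minimal illustration under stated assumptions: the helper names `proxy_accuracy` and `baseline_accuracy`, the zip codes, and the group labels are hypothetical, and the accuracy is measured on the same data used to form the guesses, so it is optimistic.

```python
from collections import Counter, defaultdict


def proxy_accuracy(feature_values, protected_values):
    """Accuracy of guessing the protected attribute from the feature alone.

    For each feature value, the guess is the most common protected value
    seen with it (measured in-sample, so this is an optimistic estimate).
    """
    if not feature_values:
        raise ValueError("inputs must not be empty")
    by_feature = defaultdict(Counter)
    for feature, protected in zip(feature_values, protected_values):
        by_feature[feature][protected] += 1
    correct = sum(c.most_common(1)[0][1] for c in by_feature.values())
    return correct / len(feature_values)


def baseline_accuracy(protected_values):
    """Accuracy of always guessing the overall most common protected value."""
    return Counter(protected_values).most_common(1)[0][1] / len(protected_values)


# Hypothetical toy data: a zip code and a group label per person.
zip_codes = ["10001"] * 50 + ["20002"] * 50
group = ["a"] * 45 + ["b"] * 5 + ["a"] * 10 + ["b"] * 40

proxy = proxy_accuracy(zip_codes, group)
baseline = baseline_accuracy(group)
print(f"guessing group from zip code: {proxy:.2f}")
print(f"guessing group with no features: {baseline:.2f}")
```

A large gap between the two numbers suggests the feature carries information about the protected attribute, so dropping the attribute itself would not stop a model from using it indirectly.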
Deployment bias:
- A model trained for one context is applied in a different one
- Results are interpreted without considering the model's limitations
Mitigation strategies include:
- Auditing training data for representation and historical bias
- Fairness-aware training objectives
- Regular bias testing across demographic groups
- Human-in-the-loop review for high-stakes decisions
- Transparency and documentation (model cards, datasheets)
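Bias testing across demographic groups often starts with comparing simple per-group rates. The sketch below computes each group's selection rate and true positive rate, then the gap in selection rates (commonly called the demographic parity difference). The function name `rates_by_group` and the toy data are hypothetical, and these two metrics are only a small sample of the fairness criteria in use.

```python
from collections import defaultdict


def rates_by_group(y_true, y_pred, groups):
    """Return selection rate and true positive rate for each group.

    y_true and y_pred are sequences of 0/1 labels; groups holds the
    group label of each example.
    """
    counts = defaultdict(
        lambda: {"n": 0, "selected": 0, "positives": 0, "true_positives": 0}
    )
    for truth, pred, group in zip(y_true, y_pred, groups):
        c = counts[group]
        c["n"] += 1
        c["selected"] += int(pred == 1)
        c["positives"] += int(truth == 1)
        c["true_positives"] += int(truth == 1 and pred == 1)

    report = {}
    for group, c in counts.items():
        report[group] = {
            "selection_rate": c["selected"] / c["n"],
            # TPR is undefined when a group has no positive examples.
            "tpr": c["true_positives"] / c["positives"] if c["positives"] else None,
        }
    return report


# Hypothetical toy data: true outcomes, model decisions, and group labels.
y_true = [1, 1, 0, 0, 1, 1, 1, 0, 0, 1]
y_pred = [1, 1, 1, 0, 0, 1, 0, 0, 0, 0]
groups = ["a"] * 5 + ["b"] * 5

report = rates_by_group(y_true, y_pred, groups)
selection_rates = [r["selection_rate"] for r in report.values()]
parity_gap = max(selection_rates) - min(selection_rates)

for group in sorted(report):
    print(group, report[group])
print(f"demographic parity difference: {parity_gap:.2f}")
```

Which gap matters, and how large a gap is acceptable, depends on the application. A check like this is best rerun regularly, since group-level rates can drift after deployment.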
Example
Amazon built an experimental hiring tool trained on resumes submitted to the company over a 10-year period. Because most of those resumes came from men, reflecting the historically male-skewed tech industry, the model learned to penalize resumes containing the word "women's" (e.g., "women's chess club"). Amazon scrapped the tool after this bias was discovered.