
What is Federated Learning?
Federated learning is a machine learning approach in which a model is trained across multiple decentralized devices or servers, each holding its own local data, without that raw data leaving the device. Instead of centralizing data, participants share only model updates (such as weight deltas or gradients), which are then aggregated.
Why It Matters
Federated learning eases the tension between AI training, which needs large amounts of data, and privacy, which often rules out centralizing sensitive data. It makes it possible to train on medical records, financial data, and data held on personal devices while helping organizations meet GDPR and similar regulations. Apple uses it for Siri and predictive text, and Google for Gboard; in both cases the raw data used for training stays on the phone.
How It Works
- Initialization: a central server sends the current global model to the participating clients (in large deployments, usually a sampled subset of devices in each round)
- Local training: each client trains the model on its local data for several epochs
- Update sharing: each client sends only its model update (new weights or weight deltas, or gradients in some schemes) to the server, never the raw data
- Aggregation: the server combines the client updates, typically with Federated Averaging (FedAvg), a weighted average in which each client counts in proportion to its number of training examples
- Distribution: the updated global model is sent back to the clients
- Repeat: the process continues for multiple rounds until the model converges
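The round structure above can be sketched in a few lines of plain Python. This is a minimal illustration, not a production setup: the model is a toy linear fit y = w*x + b, the three client datasets are synthetic, and the names local_train, fed_avg, and make_client are hypothetical helpers invented for this sketch.

```python
import random

def local_train(weights, data, lr=0.1, epochs=5):
    """Local training: a few epochs of SGD on one client's own data."""
    w, b = weights
    for _ in range(epochs):
        for x, y in data:
            err = (w * x + b) - y
            w -= lr * err * x
            b -= lr * err
    return [w, b]

def fed_avg(client_weights, client_sizes):
    """Aggregation: average client weights, weighted by example count."""
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]

def make_client(rng, n, lo, hi):
    """Synthetic client data from y = 2x + 1, each over its own x range (non-IID)."""
    return [(x, 2 * x + 1) for x in (rng.uniform(lo, hi) for _ in range(n))]

rng = random.Random(0)
clients = [
    make_client(rng, 20, -1.0, 0.0),
    make_client(rng, 40, 0.0, 1.0),
    make_client(rng, 10, -0.5, 0.5),
]

global_weights = [0.0, 0.0]                      # initialization
for round_idx in range(50):                      # repeat for several rounds
    # Distribution + local training: each client starts from the global model.
    updates = [local_train(list(global_weights), data) for data in clients]
    # Update sharing + aggregation: only weights reach the server, never `data`.
    global_weights = fed_avg(updates, [len(d) for d in clients])

print(global_weights)  # should end up close to [2.0, 1.0]
```

Here every client takes part in every round; a cross-device deployment would normally sample a subset of clients per round and would have to tolerate clients that drop out.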
Key challenges:
- Non-IID data: each device's data follows a different distribution (your texting style differs from mine), which can slow or destabilize convergence
- Communication overhead: sending updates for large models every round is expensive
- Device heterogeneity: devices differ in compute power, connectivity, and availability
- Privacy attacks: model updates can still leak information about the training data (mitigated, though not eliminated, by differential privacy and secure aggregation)
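To give a feel for the idea behind secure aggregation, the toy sketch below uses pairwise masks that cancel in the sum, so the server can recover the total of the updates without seeing any individual one. It rests on several assumptions: updates are already quantized to integers, a single seeded random generator stands in for the per-pair secrets that a real protocol would derive through key agreement, and no client drops out. The names pairwise_masks, mask_update, and aggregate are hypothetical.

```python
import random

MODULUS = 2**32  # all arithmetic is done modulo this value

def pairwise_masks(num_clients, length, seed=0):
    """Build one mask per client such that all masks sum to zero mod MODULUS.

    For each pair (i, j) with i < j, a shared random vector is added to
    client i's mask and subtracted from client j's mask.
    """
    rng = random.Random(seed)  # stand-in for secrets agreed between each pair
    masks = [[0] * length for _ in range(num_clients)]
    for i in range(num_clients):
        for j in range(i + 1, num_clients):
            shared = [rng.randrange(MODULUS) for _ in range(length)]
            for k in range(length):
                masks[i][k] = (masks[i][k] + shared[k]) % MODULUS
                masks[j][k] = (masks[j][k] - shared[k]) % MODULUS
    return masks

def mask_update(update, mask):
    """Client side: hide the update behind the client's mask."""
    return [(u + m) % MODULUS for u, m in zip(update, mask)]

def aggregate(masked_updates):
    """Server side: sum the masked updates; the masks cancel out."""
    length = len(masked_updates[0])
    return [sum(m[k] for m in masked_updates) % MODULUS for k in range(length)]

updates = [[3, 10, 7], [5, 1, 2], [4, 4, 4]]   # quantized client updates
masks = pairwise_masks(len(updates), 3)
masked = [mask_update(u, m) for u, m in zip(updates, masks)]
total = aggregate(masked)
print(total)  # [12, 15, 13], the element-wise sum of the original updates
```

Each masked vector looks random on its own; only the sum is meaningful. Differential privacy is a separate, complementary defence that typically clips each update and adds calibrated noise.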
Variants:
- Cross-device: millions of mobile devices, each with a small amount of data (Google Gboard, Apple Siri)
- Cross-silo: a small number of organizations, each with a large dataset, train a model jointly (hospitals sharing model updates)
- Federated analytics: compute aggregate statistics across devices without collecting the underlying data
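Federated analytics can be illustrated with the simplest possible statistic, a global mean. In this sketch each device reports only a local sum and count, and the server combines them. The function names and the sample values are hypothetical, and a real system would usually add protections such as secure aggregation or noise, because even a sum over a handful of values can reveal something about them.

```python
def local_summary(values):
    """Runs on the device: only the sum and the count are reported."""
    return (sum(values), len(values))

def global_mean(summaries):
    """Runs on the server: combine per-device summaries into one mean."""
    total = sum(s for s, _ in summaries)
    count = sum(n for _, n in summaries)
    return total / count

# Hypothetical per-device measurements that never leave their device.
device_values = [[2, 4, 6], [10], [1, 1, 1, 1]]
summaries = [local_summary(v) for v in device_values]
mean = global_mean(summaries)
print(mean)  # 3.25, i.e. 26 / 8
```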
Example
Five hospitals want to build a brain tumor detection model but cannot pool their patient data because of HIPAA. With federated learning, each hospital trains the model on its own patient scans, sends only the resulting model updates to a coordinator, and receives back an improved model that reflects what was learned at all five hospitals, without any patient data leaving any of them.