
What is Transfer Learning?
Transfer learning is a machine learning technique where a model trained on one task is reused as the starting point for a different but related task. Instead of training from scratch, the new model starts from knowledge the original model has already learned.
Why It Matters
Transfer learning is why you can fine-tune a powerful LLM on just a few hundred examples of your specific task and often get strong results. It's the economic engine of modern AI: the cost of pre-training a foundation model is paid once, and the knowledge it captures is reused by a very large number of downstream users. Without transfer learning, every AI application would need its own massive training run.
How It Works
- Source task — a model is trained on a large, general dataset (e.g., all of Wikipedia for language, ImageNet for vision).
- Knowledge transfer — the trained weights (internal representations) capture general features that apply across tasks: language structure, visual edges, semantic concepts.
- Target task — the pre-trained model is adapted to a specific task by:
  - Fine-tuning all parameters on new data
  - Freezing most layers and training only a small output head
  - Prompting (for LLMs), which requires no weight updates
  - Parameter-efficient fine-tuning (PEFT), such as LoRA, which trains a small set of added parameters while the original weights stay frozen
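The LoRA option in the list above can be sketched in a few lines. This is a minimal pure-Python illustration, not a library API: the class `LoRALinear` and the helpers `matvec` and `lora_param_count` are hypothetical names, and the tiny weight matrix stands in for a real pre-trained layer. The idea is that the pre-trained matrix W stays frozen while a low-rank product B·A, scaled by alpha / r, is added to it. The sketch assumes a common initialization, with B set to zero and A set to small random values, so the adapted layer initially behaves exactly like the pre-trained one.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def lora_param_count(d_out, d_in, rank):
    """Number of trainable parameters in A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)


class LoRALinear:
    """Frozen weight W plus a trainable low-rank update (alpha / r) * B @ A."""

    def __init__(self, weight, rank, alpha=1.0, seed=0):
        rng = random.Random(seed)
        d_out, d_in = len(weight), len(weight[0])
        self.weight = weight  # frozen pre-trained matrix, d_out x d_in
        # A starts with small random values, B starts at zero (assumed init),
        # so the low-rank update contributes nothing before training.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]
        self.scale = alpha / rank

    def forward(self, x):
        base = matvec(self.weight, x)                # frozen path: W x
        update = matvec(self.B, matvec(self.A, x))   # low-rank path: B (A x)
        return [b + self.scale * u for b, u in zip(base, update)]

    def trainable_params(self):
        return sum(len(row) for row in self.A) + sum(len(row) for row in self.B)

    def frozen_params(self):
        return sum(len(row) for row in self.weight)


# Hypothetical 2 x 3 "pre-trained" weight matrix, adapted with rank 1.
layer = LoRALinear([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]], rank=1)
print(layer.forward([1.0, 1.0, 1.0]))
print(layer.trainable_params(), layer.frozen_params())
```

The savings only show up at realistic sizes. For a 4096 x 4096 weight matrix and rank 8, `lora_param_count(4096, 4096, 8)` gives 65,536 trainable parameters, compared with 16,777,216 in the full matrix, a 256-fold reduction for that layer.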
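Transfer learning works because early model layers tend to learn generic features (edges in images, basic grammar in text) while later layers learn more task-specific features. The generic layers transfer well across tasks, which is why they are often kept frozen while only the later layers are retrained.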
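As a rough illustration of reusing generic layers, the sketch below freezes a small "pre-trained" backbone and trains only a new output head on a handful of target examples. Everything here is a hypothetical stand-in written in pure Python: the backbone weights `BACKBONE_W` and `BACKBONE_B`, the toy dataset `TARGET_DATA`, and the helper names. A real backbone would come from a large source-task training run, and a real project would normally use a deep learning framework instead of hand-written gradient updates.

```python
import math
import random

# Hypothetical "pre-trained" backbone: 3 hidden units over 2 inputs.
# These weights are only ever read, never updated.
BACKBONE_W = [[1.0, -1.0], [-1.0, 1.0], [0.5, 0.5]]
BACKBONE_B = [0.0, 0.0, -0.5]

# Toy target task with only four labeled examples: label 1 when x[0] > x[1].
TARGET_DATA = [
    ([1.0, 0.0], 1),
    ([0.0, 1.0], 0),
    ([2.0, 0.5], 1),
    ([0.5, 2.0], 0),
]


def backbone(x):
    """Frozen feature extractor: ReLU(W x + b)."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(BACKBONE_W, BACKBONE_B)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_head(data, epochs=200, lr=0.5, seed=0):
    """Train only a logistic-regression head on top of the frozen features."""
    rng = random.Random(seed)
    head_w = [rng.uniform(-0.1, 0.1) for _ in BACKBONE_W]
    head_b = 0.0
    losses = []
    for _ in range(epochs):
        total = 0.0
        for x, y in data:
            feats = backbone(x)  # backbone output is treated as a fixed input
            p = sigmoid(sum(w * f for w, f in zip(head_w, feats)) + head_b)
            total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
            grad = p - y  # derivative of the cross-entropy loss w.r.t. the logit
            # Only the head parameters are updated.
            head_w = [w - lr * grad * f for w, f in zip(head_w, feats)]
            head_b -= lr * grad
        losses.append(total / len(data))
    return head_w, head_b, losses


def predict(x, head_w, head_b):
    """Probability of label 1 from the frozen backbone plus the trained head."""
    feats = backbone(x)
    return sigmoid(sum(w * f for w, f in zip(head_w, feats)) + head_b)


head_w, head_b, losses = train_head(TARGET_DATA)
print("first epoch loss:", losses[0], "last epoch loss:", losses[-1])
print("P(label=1 | [1.0, 0.0]):", predict([1.0, 0.0], head_w, head_b))
```

The design choice to note is that `train_head` never touches `BACKBONE_W` or `BACKBONE_B`, so the number of parameters being learned is just the head's four values. That is what makes training feasible with very little target data in this toy setting.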
Example
A hospital fine-tunes a pre-trained language model on 500 medical consultation transcripts to build a clinical note summarizer. The model already has a broad grasp of language, grammar, and much medical terminology from pre-training, so it mainly needs to learn the specific summarization format.