What is Transfer Learning?

Transfer learning is a machine learning technique where a model trained on one task is reused as the starting point for a different but related task. Instead of training from scratch, the model reuses knowledge it has already learned.

Why It Matters

Transfer learning is why you can fine-tune a powerful LLM on just a few hundred examples of your specific task and often get strong results. It's the economic engine of modern AI: the billions of dollars spent pre-training foundation models create reusable knowledge that millions of downstream users benefit from. Without transfer learning, every AI application would need its own massive training run.

How It Works

Source task — a model is trained on a large, general dataset (e.g., all of Wikipedia for language, ImageNet for vision).
Knowledge transfer — the trained weights (internal representations) capture general features that apply across tasks: language structure, visual edges, semantic concepts.
Target task — the pre-trained model is adapted to a specific task by:

Fine-tuning all parameters on new data
Freezing most layers and only training a small output head
Prompting (for LLMs) — no weight updates needed
LoRA and other PEFT (parameter-efficient fine-tuning) methods, which train a small number of added weights while the pre-trained weights stay frozen
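
To make the last option concrete, here is a minimal sketch of the arithmetic behind LoRA, written in plain Python with no ML library. It assumes the usual formulation in which a frozen weight matrix W is used as W + (alpha / rank) * B A, where B and A are two small trainable matrices. The function names and the 4096 x 4096 layer size are illustrative assumptions, not taken from any particular model.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_param_counts(d_out, d_in, rank):
    """Return (full, lora) trainable-parameter counts for one weight matrix."""
    full = d_out * d_in            # every entry of W is trained
    lora = rank * (d_out + d_in)   # only B (d_out x rank) and A (rank x d_in) are trained
    return full, lora


def lora_effective_weight(w, a, b, alpha, rank):
    """Compute W + (alpha / rank) * B @ A without modifying the frozen W."""
    delta = matmul(b, a)  # (d_out x rank) @ (rank x d_in) -> (d_out x d_in)
    scale = alpha / rank
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


# Hypothetical layer size, chosen only to show the scale of the saving.
full, lora = lora_param_counts(d_out=4096, d_in=4096, rank=8)
print(full, lora, full // lora)  # 16777216 65536 256
```

In this formulation B is typically initialized to zeros, so the adapted model starts out identical to the pre-trained one and only drifts from it as B and A are trained.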

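Transfer learning works because earlier layers of a model tend to learn generic features (edges and textures in vision, syntax in language), while later layers learn features specific to the training task. The generic layers transfer well across tasks, which is why freezing them and retraining only the later layers is often enough.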
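
The freeze-and-retrain idea can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not a realistic pipeline: BACKBONE_W stands in for layers learned on a source task, TARGET_DATA is a made-up six-example target task, and train_head is a hypothetical helper that fits a logistic-regression head on top of the frozen features.

```python
import math

# Hypothetical "pre-trained" backbone weights; they are never updated below.
BACKBONE_W = [[1.0, 1.0], [1.0, -1.0]]

# Made-up target-task data: (input vector, binary label).
TARGET_DATA = [
    ([2.0, 1.0], 1), ([-1.0, -2.0], 0),
    ([1.0, 2.0], 1), ([-2.0, -1.0], 0),
    ([0.5, 1.5], 1), ([-0.5, -1.0], 0),
]


def backbone(x, weights):
    """Frozen feature extractor: one linear layer followed by ReLU."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in weights]


def predict(head, features):
    """Trainable output head: logistic regression on the backbone features."""
    z = head["b"] + sum(w * f for w, f in zip(head["w"], features))
    return 1.0 / (1.0 + math.exp(-z))


def train_head(data, backbone_w, epochs=200, lr=0.5):
    """Fit only the head with SGD; the backbone weights are read, never written."""
    head = {"w": [0.0] * len(backbone_w), "b": 0.0}
    for _ in range(epochs):
        for x, y in data:
            feats = backbone(x, backbone_w)
            err = predict(head, feats) - y  # gradient of log loss w.r.t. the logit
            head["w"] = [w - lr * err * f for w, f in zip(head["w"], feats)]
            head["b"] -= lr * err
    return head


head = train_head(TARGET_DATA, BACKBONE_W)
for x, y in TARGET_DATA:
    print(x, y, round(predict(head, backbone(x, BACKBONE_W)), 3))
```

Only the three head parameters change during training. In a real framework the same effect is usually achieved by excluding the backbone parameters from the optimizer or marking them as not requiring gradients.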

Example

A hospital fine-tunes a pre-trained language model on 500 medical consultation transcripts to build a clinical note summarizer. The model already has general language ability from pre-training and, to the extent its pre-training data included medical text, familiarity with medical terminology, so it mainly needs to learn the specific summarization format.