
What is Training vs Inference?
Training and inference are the two fundamental phases of a machine learning model's lifecycle:
- Training is the process of teaching a model by adjusting its parameters on data.
- Inference is the process of using the trained model to make predictions on new data.
Why It Matters
The distinction between training and inference affects cost, speed, hardware requirements, and deployment strategy. Training a frontier LLM costs millions of dollars and takes months; inference (running the model for a user) costs fractions of a cent per query and takes seconds. Understanding this distinction is essential for evaluating AI costs and capabilities.
How It Works
Training:
- Goal: Learn patterns from data by adjusting model weights
- Process: Forward pass → compute loss → backpropagation → weight update (repeated billions of times)
- Compute: Extremely intensive. Frontier models use thousands of GPUs for months
- Cost: GPT-4 training reportedly cost $100M+; Gemini Ultra likely similar
- Happens: Once (or periodically for retraining)
- Hardware: GPU/TPU clusters optimized for throughput
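The training loop above (forward pass → loss → backpropagation → weight update) can be sketched in miniature. This is a toy example, not how a real LLM is trained: it fits a single weight to the line y = 2x by gradient descent, with a made-up dataset and learning rate chosen for illustration.

```python
# Toy training loop: forward pass -> loss -> gradient -> weight update.
# Data, learning rate, and epoch count are illustrative assumptions.

data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # (x, y) pairs where y = 2x
w = 0.0    # single trainable weight, randomly could start anywhere
lr = 0.05  # learning rate

# Training phase: repeat the update loop many times
for epoch in range(200):
    for x, y in data:
        pred = w * x                  # forward pass
        loss = (pred - y) ** 2        # squared-error loss
        grad = 2 * (pred - y) * x     # backpropagation: d(loss)/d(w)
        w -= lr * grad                # weight update

# Inference phase: the weight is now frozen; just run the forward pass
print(round(w, 2))        # learned weight, close to 2.0
print(round(w * 5.0, 1))  # prediction for unseen input x = 5
```

The same asymmetry described above shows up even here: training runs the loop hundreds of times, while inference is a single multiplication. Frontier models differ in scale (billions of weights, thousands of GPUs), not in the basic shape of the loop.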