Core Concepts
Intermediate
2026-W17

What is a Loss Function?

A loss function measures how wrong a model's predictions are, providing the error signal that training algorithms minimize to improve the model.

Also known as:
cost function
objective function
verliesfunctie
kostenfunctie

A loss function (also called a cost function or objective function) is a mathematical function that measures how far a model's predictions are from the actual target values. It provides the error signal that training algorithms use to improve the model.

Why It Matters

The loss function defines what "learning" means for a model: it is the quantity that training minimizes. The choice matters because the model optimizes exactly what the loss rewards. MSE, for example, penalizes large errors heavily, so a few outliers can dominate training. For LLMs, the cross-entropy loss on next-token prediction is what drives the model to learn language.

How It Works

A loss function takes two inputs:

  • Prediction: what the model outputs
  • Target: what the correct answer is

It returns a single number (the loss) representing how wrong the prediction is. During training, backpropagation computes the gradient of the loss with respect to each model weight, and gradient descent then adjusts the weights in the direction that reduces the loss.
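The loop described above can be sketched in a few lines of plain Python. This is a minimal illustration rather than how any framework implements training: the one-weight model `w * x`, the toy data, the learning rate, and the function names are all assumptions made for the example, and the gradient is derived by hand instead of by backpropagation.

```python
# Toy data (assumed for illustration): targets follow y = 2x, so the ideal weight is 2.0.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

def mse_loss(w):
    """Mean squared error of the one-weight model w * x against the targets."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def mse_grad(w):
    """Derivative of mse_loss with respect to w, worked out by hand."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)

def train(w=0.0, learning_rate=0.05, steps=50):
    """Plain gradient descent: repeatedly step against the gradient."""
    history = [mse_loss(w)]
    for _ in range(steps):
        w -= learning_rate * mse_grad(w)
        history.append(mse_loss(w))
    return w, history

w, history = train()
print(f"final weight: {w:.4f}")
print(f"loss went from {history[0]:.4f} to {history[-1]:.6f}")
```

The loss starts at 30.0 with `w = 0` and shrinks toward zero as `w` approaches 2.0. Real models follow the same pattern with millions of weights and automatically computed gradients.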

Common loss functions:

  • Cross-entropy loss: standard for classification and language modeling. Measures how much the predicted probability distribution differs from the target distribution. When the target is a single correct class, it reduces to the negative log of the probability the model assigned to that class.
  • Mean Squared Error (MSE): standard for regression. Averages the squared differences between predictions and targets.
  • Binary cross-entropy: the two-class case of cross-entropy, used for binary classification (yes/no, spam/not spam).
  • Contrastive loss: for learning embeddings, pulling similar items together and pushing dissimilar items apart.
  • KL divergence: measures how one probability distribution differs from another. It is asymmetric, so swapping the two distributions changes the value. Used in variational autoencoders and knowledge distillation.
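Several of the losses listed above fit in a line or two each. The following is a sketch using only the Python standard library, with hypothetical function names and the natural logarithm assumed throughout. Library versions add batching, numerical safeguards, and reduction options that are omitted here. Contrastive loss is left out because it has several distinct formulations.

```python
import math

def mse(predictions, targets):
    """Mean of the squared differences between predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)

def cross_entropy(predicted_probs, true_index):
    """Loss for one example: negative log of the probability given to the correct class."""
    return -math.log(predicted_probs[true_index])

def binary_cross_entropy(p, label):
    """p is the predicted probability of the positive class; label is 0 or 1."""
    return -(label * math.log(p) + (1 - label) * math.log(1 - p))

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions; terms with p_i == 0 contribute zero."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

print(mse([2.5, 0.0], [3.0, -0.5]))
print(cross_entropy([0.7, 0.2, 0.1], true_index=0))
print(binary_cross_entropy(0.9, label=1))
print(kl_divergence([0.5, 0.5], [0.9, 0.1]))
```

Note that `kl_divergence(p, q)` and `kl_divergence(q, p)` generally give different values, and that the inputs to the log-based losses must be valid probabilities strictly greater than zero where a logarithm is taken.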

Example

When training GPT on next-token prediction, suppose the actual next token is 'cat' but the model assigned 'cat' only 10% probability. Using the natural logarithm, the cross-entropy loss for that step is -ln(0.10) ≈ 2.30, which is high because the model put little probability on the right answer. After many training steps, the model learns to assign higher probability to correct tokens, driving the loss down.
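To make the relationship between confidence and loss concrete, the short sketch below evaluates the per-token cross-entropy at a few probabilities. The function name `next_token_loss` and the chosen probabilities are illustrative assumptions, and the natural logarithm is assumed.

```python
import math

def next_token_loss(prob_of_correct_token):
    """Cross-entropy for one token: negative log of the probability of the correct token."""
    return -math.log(prob_of_correct_token)

# The loss shrinks toward zero as the model grows more confident in the right token.
for p in (0.10, 0.50, 0.90, 0.99):
    print(f"p = {p:.2f} -> loss = {next_token_loss(p):.3f}")
```

The loss is unbounded as the probability approaches zero and reaches exactly zero only when the model assigns the correct token a probability of 1.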

Sources

  1. Google ML Crash Course – Loss Functions
  2. PyTorch – Loss Functions



© 2026 BVDNET. All rights reserved.
