Models & Architecture
9 concepts

LoRA (Low-Rank Adaptation)
An efficient fine-tuning method that freezes the pretrained weights and trains only small low-rank adapter matrices added to them, instead of updating the full model
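A minimal sketch of the LoRA idea, assuming a single linear layer with hypothetical sizes: the frozen weight W is augmented with a trainable low-rank product B·A, scaled by alpha/r, and B is zero-initialized so the adapter starts as a no-op.

```python
import numpy as np

rng = np.random.default_rng(0)

d_in, d_out, r = 16, 16, 4             # hypothetical layer sizes and LoRA rank
alpha = 8                              # LoRA scaling hyperparameter

W = rng.normal(size=(d_out, d_in))     # frozen pretrained weight (never updated)
A = rng.normal(size=(r, d_in)) * 0.01  # trainable low-rank factor
B = np.zeros((d_out, r))               # trainable, zero-init: adapter starts as identity

def lora_forward(x):
    # Adapted layer: W x + (alpha / r) * B A x — only A and B are trained
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.normal(size=d_in)
# Before any training B is zero, so the adapted layer equals the original
assert np.allclose(lora_forward(x), W @ x)
```

The adapter adds only r·(d_in + d_out) trainable parameters per layer, far fewer than the d_in·d_out in W.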

Model Distillation
Training a smaller 'student' model to replicate a larger 'teacher' model's capabilities at a fraction of the cost and latency
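The soft-target loss used in classic distillation can be sketched as follows — a KL divergence between the teacher's and student's temperature-softened output distributions (the logits and temperature here are illustrative values, not from any real model):

```python
import numpy as np

def softmax(z):
    z = np.asarray(z, dtype=float)
    e = np.exp(z - z.max())
    return e / e.sum()

def distillation_loss(teacher_logits, student_logits, T=2.0):
    # Soft-target loss: KL(teacher_T || student_T), scaled by T^2 so
    # gradients stay comparable across temperatures
    p = softmax(np.asarray(teacher_logits) / T)
    q = softmax(np.asarray(student_logits) / T)
    return T * T * np.sum(p * (np.log(p) - np.log(q)))

# A student matching the teacher incurs ~zero loss; a diverging one does not
assert distillation_loss([2.0, 1.0, 0.1], [2.0, 1.0, 0.1]) < 1e-9
assert distillation_loss([2.0, 1.0, 0.1], [0.1, 1.0, 2.0]) > 0
```

The temperature softens both distributions so the student also learns from the teacher's relative probabilities over wrong answers, not just the argmax.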

Perplexity in NLP
A common intrinsic metric for language model quality — the exponentiated average negative log-likelihood per token, measuring how well a model predicts held-out text; lower values indicate better prediction
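The definition above can be computed directly, assuming you have per-token log-probabilities from a model:

```python
import math

def perplexity(token_logprobs):
    # exp of the average negative log-likelihood per token
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

# A model that assigns probability 0.25 to every token has perplexity 4:
# it is "as confused" as a uniform choice among 4 options per token
assert abs(perplexity([math.log(0.25)] * 10) - 4.0) < 1e-9
```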

Quantization
Reducing the precision of model weights (and sometimes activations) from 16/32-bit floats to 8/4-bit integers to shrink memory footprint and speed up inference
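A minimal sketch of symmetric per-tensor int8 quantization — one of the simplest schemes; real deployments typically use per-channel or group-wise variants:

```python
import numpy as np

def quantize_int8(w):
    # Symmetric quantization: map floats to int8 via a single scale factor
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

w = np.array([0.5, -1.2, 0.03, 0.9], dtype=np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
# Round-trip error is bounded by half the quantization step
assert np.max(np.abs(w - w_hat)) <= scale / 2 + 1e-6
```

Storage drops 4× versus float32, at the cost of the small reconstruction error shown above.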

RAG (Retrieval-Augmented Generation)
A technique that combines LLMs with external knowledge retrieval to improve accuracy and reduce hallucinations
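The retrieve-then-prompt flow can be sketched end to end. Everything here is a toy stand-in: a real system would use embedding similarity instead of word overlap, and would send the prompt to an actual LLM.

```python
# Toy RAG pipeline: score documents against the query, keep the top-k,
# and inject them into the prompt as grounding context.
def score(query, doc):
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / len(q)

def retrieve(query, corpus, k=1):
    return sorted(corpus, key=lambda doc: score(query, doc), reverse=True)[:k]

def build_prompt(query, corpus):
    context = "\n".join(retrieve(query, corpus))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

corpus = [
    "LoRA trains low-rank adapter matrices.",
    "The KV cache stores attention keys and values.",
]
prompt = build_prompt("what does the KV cache store", corpus)
assert "keys and values" in prompt
```

Grounding the answer in retrieved text is what reduces hallucination: the model is asked to answer from the context rather than from its parameters alone.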

RLHF (Reinforcement Learning from Human Feedback)
A training technique that uses human preference ratings to align LLM behavior with human values
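One core ingredient of RLHF is a reward model trained on pairwise human preferences. A minimal sketch of the Bradley–Terry pairwise loss it typically minimizes (the reward values here are illustrative scalars):

```python
import math

def preference_loss(r_chosen, r_rejected):
    # Bradley–Terry pairwise loss: -log sigmoid(r_chosen - r_rejected).
    # Minimizing it pushes the reward model to score the human-preferred
    # response above the rejected one.
    margin = r_chosen - r_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# Loss shrinks as the preferred response is ranked further above the rejected one
assert preference_loss(2.0, 0.0) < preference_loss(0.5, 0.0)
assert preference_loss(0.0, 2.0) > preference_loss(2.0, 0.0)
```

The trained reward model then supplies the reward signal that a policy-gradient method (commonly PPO) uses to fine-tune the LLM.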

Transformer
The neural network architecture underlying virtually all modern LLMs, using self-attention to process all tokens in a sequence in parallel

Attention Mechanism
The mathematical mechanism that allows transformers to dynamically focus on the most relevant parts of the input when processing each token
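The core computation is scaled dot-product attention, softmax(QKᵀ/√d_k)V — a sketch with small random matrices standing in for real query/key/value projections:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    # Each query token takes a weighted average of the value vectors,
    # weighted by (scaled, softmaxed) query-key similarity
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights

rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(3, 4)) for _ in range(3))
out, w = scaled_dot_product_attention(Q, K, V)
# Each token's attention weights form a probability distribution over the input
assert np.allclose(w.sum(axis=-1), 1.0)
```

The √d_k scaling keeps the dot products from growing with dimension, which would otherwise saturate the softmax.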

KV Cache
A memory optimization that stores previously computed key and value tensors in transformer attention layers — avoiding redundant computation during autoregressive generation and often speeding it up severalfold, with the gain growing as the sequence lengthens
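A toy single-head decoder step showing why the cache helps: at each step only the new token's key and value are computed and appended, while all earlier entries are reused. The projection matrices are random stand-ins for trained weights.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))

K_cache, V_cache = [], []

def decode_step(x_t):
    q = Wq @ x_t
    K_cache.append(Wk @ x_t)   # one new key per step — older ones are reused
    V_cache.append(Wv @ x_t)   # likewise for values
    K, V = np.stack(K_cache), np.stack(V_cache)
    s = K @ q / np.sqrt(d)     # attend over the full cached history
    w = np.exp(s - s.max()); w /= w.sum()
    return w @ V

tokens = rng.normal(size=(5, d))
outputs = [decode_step(x) for x in tokens]
assert len(K_cache) == 5       # cache grows by exactly one entry per token
```

Without the cache, step t would recompute keys and values for all t previous tokens, making generation quadratic in sequence length instead of linear per step.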