GPU
Graphics Processing Unit — originally designed for rendering graphics, GPUs excel at the parallel mathematical operations needed for training and running AI models. They are the primary hardware for modern AI.
Why It Matters
GPU availability is a major bottleneck for AI development. The AI boom has made GPUs from NVIDIA among the most sought-after and supply-constrained technology in the world.
Example
NVIDIA's H100 GPU can deliver nearly 4 petaFLOPS of AI compute (at FP8 precision, with sparsity), and training a frontier model might require thousands of these GPUs working in parallel.
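To make the scale concrete, here is a back-of-envelope estimate of how long such a run takes. Every number below is an illustrative assumption (a hypothetical 1e25-FLOP training budget, a 10,000-GPU cluster, ~1 petaFLOP/s of sustained rather than peak throughput per GPU), not a vendor spec:

```python
# Back-of-envelope estimate of training time on a GPU cluster.
# All numbers are illustrative assumptions, not measured figures.

def training_days(total_flops, num_gpus, flops_per_gpu_per_s):
    """Days needed to run `total_flops` on a cluster of GPUs."""
    seconds = total_flops / (num_gpus * flops_per_gpu_per_s)
    return seconds / 86_400  # seconds per day

days = training_days(
    total_flops=1e25,         # assumed training budget
    num_gpus=10_000,          # assumed cluster size
    flops_per_gpu_per_s=1e15, # ~1 petaFLOP/s sustained per GPU (assumption)
)
print(f"{days:.0f} days")  # ~12 days
```

In practice, sustained throughput is well below a chip's peak rating, so real runs take longer than the peak numbers suggest.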
Think of it like...
Like a massive kitchen with hundreds of burners compared to a single-burner stove — a CPU works through a handful of tasks at a time, while a GPU handles thousands simultaneously.
Related Terms
TPU
Tensor Processing Unit — Google's custom-designed chip specifically optimized for machine learning workloads. TPUs are designed for matrix operations that are fundamental to neural network computation.
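The "matrix operations fundamental to neural network computation" are concrete and small at their core: a dense layer's forward pass is just a matrix-vector product plus a bias. A minimal sketch in plain Python (toy values, no accelerator involved):

```python
# A dense neural-network layer computes y = Wx + b: one matrix-vector
# product plus a bias. Accelerators like TPUs and GPUs exist to run
# enormous batches of exactly this operation.

def dense_forward(W, x, b):
    """Matrix-vector product plus bias, in plain Python."""
    return [sum(w * xi for w, xi in zip(row, x)) + bj
            for row, bj in zip(W, b)]

W = [[1.0, 2.0],   # 2x2 weight matrix (toy values)
     [3.0, 4.0]]
x = [1.0, 1.0]     # input vector
b = [0.5, -0.5]    # bias vector
print(dense_forward(W, x, b))  # [3.5, 6.5]
```

A real model stacks millions of such products over much larger matrices, which is why dedicated matrix hardware pays off.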
Compute
The computational resources (processing power, memory, time) required to train or run AI models. Compute is measured in FLOPs (floating-point operations) and is a primary constraint and cost in AI development.
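A widely used rule of thumb from the scaling-law literature estimates a transformer's training compute as roughly 6 × N × D FLOPs, where N is the parameter count and D the number of training tokens. A hedged sketch (the 70B/1.4T figures are illustrative, not a claim about any specific model):

```python
# Rule of thumb: training a transformer takes roughly 6 * N * D FLOPs,
# where N = parameter count and D = training tokens. The factor 6
# covers the forward pass (~2*N*D) plus the backward pass (~4*N*D).

def training_flops(n_params, n_tokens):
    return 6 * n_params * n_tokens

# Illustrative: a 70-billion-parameter model on 1.4 trillion tokens
flops = training_flops(70e9, 1.4e12)
print(f"{flops:.2e} FLOPs")  # ~5.88e+23
```

Estimates like this are how "compute budgets" for training runs are compared, even though real runs add overhead the formula ignores.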
CUDA
Compute Unified Device Architecture — NVIDIA's parallel computing platform that enables GPU programming for AI workloads. CUDA is the dominant software ecosystem for AI computation.
Parallel Computing
Processing multiple computations simultaneously rather than sequentially. Parallel computing is fundamental to AI training and inference, which involve massive matrix operations.
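The matrix operations above parallelize because their pieces are independent: each output row of a matrix-vector product can be computed in any order, or all at once. The sketch below uses Python's `ThreadPoolExecutor` only to illustrate that decomposition; real speedups come from hardware with thousands of parallel lanes, not from Python threads:

```python
# Rows of a matrix-vector product are independent tasks, so they can
# run in any order or simultaneously -- the property GPUs exploit.
from concurrent.futures import ThreadPoolExecutor

def dot(row, x):
    return sum(a * b for a, b in zip(row, x))

def matvec_parallel(M, x):
    """Compute M @ x by dispatching each row's dot product as a task."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(lambda row: dot(row, x), M))

M = [[1, 0], [0, 1], [2, 2]]
x = [3, 4]
print(matvec_parallel(M, x))  # [3, 4, 14]
```

The key design point is that no task depends on another's result, so there is nothing to synchronize until the end — exactly the structure AI training and inference workloads have at massive scale.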