Prompt Tuning
A parameter-efficient fine-tuning technique that prepends learnable 'soft prompt' tokens to the input while keeping the main model weights frozen. Only the soft prompt parameters are trained.
Why It Matters
Prompt tuning achieves performance close to that of full fine-tuning at a fraction of the cost. Each task gets its own tiny set of learnable parameters while sharing one frozen base model.
Example
Training just 20 learnable token embeddings (a few KB) that are prepended to every input can adapt a frozen 70B model to a specific task without touching its weights.
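The mechanism can be sketched in a few lines of PyTorch. This is a minimal, hypothetical example: the "base model" below is a toy embedding-plus-head stack standing in for a real frozen LLM, and the module and parameter names are illustrative, not from any library.

```python
import torch
import torch.nn as nn

class PromptTunedModel(nn.Module):
    """A frozen toy base model plus a learnable soft prompt."""

    def __init__(self, vocab_size=100, d_model=16, num_prompt_tokens=20):
        super().__init__()
        # Stand-in for the pretrained model: all of its weights are frozen.
        self.embed = nn.Embedding(vocab_size, d_model)
        self.head = nn.Linear(d_model, vocab_size)
        for p in list(self.embed.parameters()) + list(self.head.parameters()):
            p.requires_grad = False
        # The only trainable parameters: 20 soft-prompt token embeddings.
        self.soft_prompt = nn.Parameter(torch.randn(num_prompt_tokens, d_model) * 0.02)

    def forward(self, input_ids):
        tok = self.embed(input_ids)                                # (batch, seq, d)
        prompt = self.soft_prompt.unsqueeze(0).expand(tok.size(0), -1, -1)
        x = torch.cat([prompt, tok], dim=1)                        # prepend soft prompt
        return self.head(x)

model = PromptTunedModel()
trainable = [name for name, p in model.named_parameters() if p.requires_grad]
print(trainable)  # only 'soft_prompt' is trainable
```

During training, the optimizer is given only `model.soft_prompt`, so gradient updates never touch the base model; at inference, different tasks swap in different soft prompts over the same shared weights.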
Think of it like...
Like adding a personalized cover letter to a standard resume template — the template (model) stays the same, but the cover letter (soft prompt) customizes it for each application.
Related Terms
Fine-Tuning
The process of taking a pre-trained model and continuing to train it on a smaller, domain-specific dataset to specialize its behavior for a particular task or domain. Fine-tuning adjusts the model's weights to improve performance on the target task.
LoRA
Low-Rank Adaptation — a parameter-efficient fine-tuning technique that freezes the original model weights and adds small trainable matrices to each layer. It dramatically reduces the compute and memory needed for fine-tuning.
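The low-rank idea behind LoRA can be illustrated with plain NumPy. This is a sketch under assumed shapes (the dimensions, rank, and variable names are arbitrary): a frozen weight `W` is augmented with a rank-`r` update `B @ A`, and only `A` and `B` would be trained.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, r = 64, 64, 4

W = rng.normal(size=(d_out, d_in))     # frozen pretrained weight
A = rng.normal(size=(r, d_in)) * 0.01  # trainable low-rank down-projection
B = np.zeros((d_out, r))               # trainable up-projection, zero-initialized

def forward(x):
    # Base path plus low-rank update: x W^T + x A^T B^T
    return x @ W.T + (x @ A.T) @ B.T

x = rng.normal(size=(2, d_in))
# With B zero-initialized, the adapter starts as a no-op:
# the adapted layer's output equals the frozen layer's output.
assert np.allclose(forward(x), x @ W.T)

# Trainable parameters: A (4*64) + B (64*4) = 512, vs. 4096 frozen in W.
print(A.size + B.size, W.size)
```

Zero-initializing `B` is the standard trick that makes the adapted model start out identical to the base model, so training begins from the pretrained behavior rather than a random perturbation.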