Fine-Tuning

LLMs

Continuing training on a pretrained model with a smaller, task-specific dataset to specialize its behavior.


In one line

Take a pretrained model and keep training on a smaller curated dataset so it adapts to your task, domain, or style.

What it actually means

Start from the weights of a model that has already learned general language or vision features. Pick a small learning rate, optionally freeze some layers, and train on your own data for a few epochs. You can fine-tune all parameters (full fine-tuning, expensive), a subset (freeze the backbone, train only a new head), or a tiny set of adapter parameters (LoRA / QLoRA). The choice is a tradeoff between quality, training cost, and how many model variants you need to ship.
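The middle option (freeze the backbone, train only a head) can be sketched in a few lines of plain PyTorch. The tiny backbone and 3-class head here are made up for illustration; the point is that frozen parameters keep their pretrained values and never receive gradients:

```python
import torch
import torch.nn as nn

# Hypothetical tiny setup: a "pretrained" backbone plus a new task head.
backbone = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 64))
head = nn.Linear(64, 3)  # e.g. a 3-class classifier for our task

# Freeze the backbone: its weights stay at their pretrained values.
for p in backbone.parameters():
    p.requires_grad = False

# Only the head's parameters go to the optimizer, with a small learning rate.
optimizer = torch.optim.AdamW(head.parameters(), lr=1e-4)

# One training step on a dummy batch.
x, y = torch.randn(8, 128), torch.randint(0, 3, (8,))
loss = nn.functional.cross_entropy(head(backbone(x)), y)
loss.backward()
optimizer.step()

trainable = sum(p.numel() for p in head.parameters())      # 195
frozen = sum(p.numel() for p in backbone.parameters())     # 12416
```

After `backward()`, only the head's parameters carry gradients, so only they move; the backbone acts as a fixed feature extractor.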

Why it matters

Fine-tuning is how you bake a specific behavior (a tone, a structured output format, a tool-use pattern) into the weights, so you don't pay for it in prompt tokens on every request. It's not a good way to teach new factual knowledge (use RAG for that), but it is very good at shaping style and making output formats reliable.

Example

from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# any Llama-style causal LM whose attention projections are named q_proj / v_proj
base_model = AutoModelForCausalLM.from_pretrained("TinyLlama/TinyLlama-1.1B-Chat-v1.0")
lora = LoraConfig(r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"])
model = get_peft_model(base_model, lora)
# now train as usual: only a few MB of adapter weights get updated
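For intuition about what the peft example above is doing, a LoRA adapter can be sketched by hand: keep the pretrained weight W frozen and learn a low-rank update scaled by alpha/r. This `LoRALinear` class is a simplified illustration, not the peft implementation:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Minimal sketch of a LoRA-adapted linear layer (illustrative, not peft's code)."""
    def __init__(self, base: nn.Linear, r: int = 16, alpha: int = 32):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # pretrained weight stays frozen
        # Low-rank factors: only A and B are trained.
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero-init: adapter starts as a no-op
        self.scale = alpha / r

    def forward(self, x):
        # base(x) + scale * x A^T B^T, i.e. W x + (alpha/r) * B A x
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

layer = LoRALinear(nn.Linear(512, 512), r=16, alpha=32)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)  # 2 * 16 * 512
x = torch.randn(2, 512)
out = layer(x)
```

With r=16 on a 512x512 layer, the trainable adapter is 16,384 parameters versus 262,656 in the frozen base, which is why shipping many LoRA variants of one base model is cheap.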

You’ll hear it when

  • Deciding between prompting, fine-tuning, and RAG.
  • Picking LoRA vs full fine-tuning for a project.
  • Discussing the instruction-tuning pipeline for a base model.
  • Shipping a model that needs a very specific output format.

Related terms

See also