Learning vertical
Track 03 · Applied & Agentic · Intermediate · ~8 min

Customizing a model: prompt, RAG, or fine-tune

There are three ways to bend an off-the-shelf model to your needs — write a better prompt, give it your sources with RAG, or fine-tune its weights. Each fixes a different gap, and reaching for the heaviest one first is the classic mistake. Learn what each lever does, when to use it, and the simple rule of thumb: start with prompting, add RAG, fine-tune only if needed.

01 What "customizing a model" means

Buying an off-the-rack suit and having a tailor take it in is far easier than sewing one from scratch, and you still end up with something that fits. Customizing an AI model works the same way: rather than training from random weights, you start from a capable, ready-made model that's decent at many things and steer its output toward your specific use case. There are three levers for doing that, from lightest to heaviest: prompting, RAG, and fine-tuning.

  • Two of the three levers — prompting and RAG — change the output without touching the weights at all.
  • Only fine-tuning changes the model itself, by continuing its training on examples.
  • The art is matching the lever to the gap: is it a knowledge gap, or a behavior gap?

02 The three levers

Think of customization as three dials you can turn. Prompting shapes the output at request time with instructions, context, and examples. RAG (retrieval-augmented generation) adds knowledge at answer time by retrieving your documents. Fine-tuning changes the model's behavior by training it on examples. Each lever changes one thing and leaves the rest alone.

From lightest to heaviest customization:

  • Prompting: request time
  • RAG: answer time
  • Fine-tuning: changes weights
  • Combine: mix & match

Lever 1 — lightest

Prompting

Shape the output at request time: write clearer instructions, add context, give a few examples (few-shot), or constrain the format. The model and its weights are unchanged. It's the cheapest and fastest lever — and the recommended first step for almost any task.
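
As a rough illustration of the prompting lever, the sketch below assembles instructions, a format constraint, and a few examples into one prompt string. The function name `build_prompt`, the "Input:/Output:" layout, and the sample messages are assumptions made for this example, not a required format; the resulting text would be sent to whichever model you use.

```python
def build_prompt(instructions, examples, query, output_format=None):
    """Assemble a few-shot prompt as plain text; no model weights are involved."""
    parts = [instructions.strip()]
    if output_format:
        parts.append(f"Answer format: {output_format}")
    # Each example is an (input, output) pair shown to the model as a pattern.
    for example_input, example_output in examples:
        parts.append(f"Input: {example_input}\nOutput: {example_output}")
    # The real request goes last, with the output left open for the model.
    parts.append(f"Input: {query}\nOutput:")
    return "\n\n".join(parts)


# Hypothetical task and examples, chosen only to show the shape of the prompt.
prompt = build_prompt(
    instructions="Classify the sentiment of each customer message.",
    examples=[
        ("The update fixed my issue, thanks!", "positive"),
        ("I have been on hold for an hour.", "negative"),
    ],
    query="The new dashboard is confusing.",
    output_format="one word, either positive or negative",
)
print(prompt)
```

Changing the behavior here means editing a string and re-sending it, which is why this lever is the quickest to iterate on.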

03 Which method should you use?

The right lever depends on the gap you're closing. If the model lacks knowledge, or the facts change quickly, RAG is the usual fit; if the gap is behavior such as tone, format, or a skill that prompting can't reliably produce, fine-tuning is; for most other needs, a better prompt is enough. These are teaching defaults, not hard rules: the levers can be combined.
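
The rule of thumb can be written down as a tiny decision helper. This is only a sketch of the teaching default; the function name `recommend_lever`, the gap labels "knowledge" and "behavior", and the `prompting_falls_short` flag are hypothetical names introduced for this example.

```python
def recommend_lever(gap, prompting_falls_short=False):
    """Return the teaching-default lever for a gap.

    gap: "knowledge" (missing or fast-changing facts) or
         "behavior" (tone, format, or skill).
    prompting_falls_short: set to True only after a better prompt
         has been tried and did not close the gap.
    """
    if gap not in ("knowledge", "behavior"):
        raise ValueError(f"unknown gap: {gap!r}")
    # Always start with the lightest lever.
    if not prompting_falls_short:
        return "prompting"
    # Knowledge gaps go to RAG; behavior gaps go to fine-tuning.
    return "RAG" if gap == "knowledge" else "fine-tuning"


print(recommend_lever("knowledge"))
print(recommend_lever("knowledge", prompting_falls_short=True))
print(recommend_lever("behavior", prompting_falls_short=True))
```

A real decision usually weighs cost, data, and upkeep as well, and may end in a combination such as fine-tuning for behavior plus RAG for facts.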

04 What fine-tuning actually does

Fine-tuning is the only lever that changes the model itself. It continues training the model on a curated set of example input/output pairs, so the desired pattern is learned into the weights and becomes part of the model. That's the sharp contrast with RAG: RAG leaves the weights untouched and instead adds retrieved knowledge to the prompt at answer time. Put simply — fine-tuning changes behavior; RAG changes available knowledge. One re-trains the model; the other just hands it better context when a question arrives.

  • Fine-tuning bakes a tone, format, or skill into the weights — update it by training again.
  • RAG bakes nothing into the model — update it by editing the knowledge base.
  • Fast-changing facts suit RAG; consistent behavior suits fine-tuning. They combine cleanly.
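
To make the RAG side of that contrast concrete, here is a minimal sketch that retrieves the best-matching document and places it in the prompt. It assumes a toy word-overlap retriever and made-up policy documents; production systems typically use embedding search, and the names `tokenize`, `retrieve`, and `build_rag_prompt` are hypothetical.

```python
import re


def tokenize(text):
    """Lowercase a string and split it into a set of words and numbers."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, knowledge_base, top_k=1):
    """Rank documents by how many words they share with the question."""
    question_words = tokenize(question)
    ranked = sorted(
        knowledge_base,
        key=lambda doc: len(question_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:top_k]


def build_rag_prompt(question, knowledge_base):
    """Add the retrieved text to the prompt; the model itself is untouched."""
    context = "\n".join(retrieve(question, knowledge_base))
    return (
        "Use only this context to answer.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


# Made-up documents standing in for your knowledge base.
knowledge_base = [
    "Refund policy: refunds are available within 30 days of purchase.",
    "Shipping policy: orders ship within 2 business days.",
]
question = "How many days do I have to request a refund?"
before = build_rag_prompt(question, knowledge_base)

# Updating knowledge means editing a document; no training run is involved.
knowledge_base[0] = "Refund policy: refunds are available within 60 days of purchase."
after = build_rag_prompt(question, knowledge_base)
print(after)
```

The same question picks up the new fact as soon as the document is edited, which is the sense in which RAG changes available knowledge rather than behavior.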

05 How fine-tuning is done: SFT, LoRA & PEFT

"Fine-tuning" is an umbrella term. The most common form is supervised fine-tuning (SFT). Because fully fine-tuning every parameter is expensive, many teams use parameter-efficient methods such as LoRA instead. The three views below describe each at a high level.

SFT & instruction tuning — learn from examples

Supervised fine-tuning (SFT) trains the model on pairs of inputs and desired outputs — the simplest and most common way to adapt a model to a target dataset. Instruction tuning is SFT on instruction-and-response examples, which teaches a base model to follow instructions and hold a conversation.

  • Trains on input → desired-output pairs (fully supervised).
  • Instruction tuning teaches "follow instructions / chat" from examples.
  • Changes the model's behavior, baked into the weights.
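
The idea of continuing training on input/output pairs can be shown with a deliberately tiny stand-in. The sketch below assumes a two-parameter linear model in place of a language model and uses made-up example pairs; real SFT trains a neural network with a token-level loss, but the loop has the same shape: start from existing weights, then nudge them toward the desired outputs.

```python
def predict(weights, x):
    """Toy model: a line with slope w and intercept b."""
    w, b = weights
    return w * x + b


def mean_squared_error(weights, pairs):
    """Average squared gap between predictions and desired outputs."""
    return sum((predict(weights, x) - y) ** 2 for x, y in pairs) / len(pairs)


def fine_tune(weights, pairs, learning_rate=0.05, epochs=500):
    """Continue training from existing weights on (input, desired output) pairs."""
    w, b = weights
    n = len(pairs)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in pairs) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in pairs) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return (w, b)


# "Pretrained" weights and a small curated dataset; both are illustrative.
pretrained = (1.0, 0.0)
examples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]  # follows y = 2x + 1

loss_before = mean_squared_error(pretrained, examples)
tuned = fine_tune(pretrained, examples)
loss_after = mean_squared_error(tuned, examples)
print(loss_before, loss_after, tuned)
```

After training, the new pattern lives in the weights themselves, so changing it again means another training run.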

LoRA / PEFT — fine-tune without training everything

PEFT (parameter-efficient fine-tuning) adapts a model by training only a small number of (often added) parameters while freezing the rest, cutting compute and storage cost, often with comparable quality. LoRA is a popular PEFT method: it freezes the pretrained weights and injects small trainable low-rank matrices, training only those. Because each adapter is small, you can keep several task-specific adapters for one base model, and merging an adapter into the base weights means it adds no inference latency.

  • PEFT: train a few parameters, freeze the rest.
  • LoRA: frozen base + small low-rank update matrices.
  • Perk: swappable per-task adapters, cheaper to train & store.
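
A small numeric sketch may help show what "frozen base + low-rank update" means. It assumes a single 4×4 weight matrix, rank 1, and made-up values for the factors `A` and `B` (standing in for values a training run would produce); the helper names are hypothetical, and real LoRA libraries apply this per layer with tensors rather than Python lists.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def add(a, b):
    """Add two matrices of the same shape element by element."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def lora_param_count(d_out, d_in, rank):
    """Trainable parameters in the two low-rank factors."""
    return rank * (d_out + d_in)


d_out, d_in, rank = 4, 4, 1

# Frozen pretrained weight matrix (identity here, purely for readability).
W = [[1.0 if i == j else 0.0 for j in range(d_in)] for i in range(d_out)]

# Trainable low-rank factors: B is d_out x rank, A is rank x d_in.
# These values pretend to be the result of training.
B = [[0.5], [0.0], [0.0], [0.0]]
A = [[0.0, 1.0, 0.0, 0.0]]

x = [[1.0], [2.0], [3.0], [4.0]]  # input as a column vector

# Adapter kept separate: base output plus the low-rank update applied to x.
y_adapter = add(matmul(W, x), matmul(B, matmul(A, x)))

# Adapter merged: fold B @ A into W once, then use a single matrix multiply.
W_merged = add(W, matmul(B, A))
y_merged = matmul(W_merged, x)

print(y_adapter == y_merged)
print(d_out * d_in, lora_param_count(d_out, d_in, rank))
print(4096 * 4096, lora_param_count(4096, 4096, 8))
```

At this toy size the saving is small, but for a 4096×4096 matrix at rank 8 the trainable count drops from about 16.8 million to 65,536, which is where the lower training and storage cost comes from. The merged form does the same work as the base matrix alone, so it adds no extra step at inference.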

The tradeoffs — cost, data, maintenance

The levers line up along cost and upkeep. Prompting: lowest cost and data, instant to change, changes nothing in the model. RAG: moderate setup, update by editing the knowledge base — facts stay fresh and traceable. Fine-tuning: highest data, compute, and ongoing maintenance, and not suited to fast-changing facts. PEFT/LoRA lowers fine-tuning's compute cost, but doesn't erase the data and maintenance burden.

  • Prompting: cheapest · iterate in seconds · model unchanged.
  • RAG: update the docs, not the model · fresh + traceable.
  • Fine-tuning: heaviest · re-train to update · PEFT/LoRA trims compute.

07 Take it with you & go deeper

"Prompt vs RAG vs fine-tune" — one-page summary
The whole module distilled to a printable cheat-sheet.
Coming next (in the pipeline)

Fine-tuning deep dive (SFT & LoRA)

How to prepare a dataset, run supervised fine-tuning, and use LoRA adapters, with the data and maintenance trade-offs spelled out.

Evaluating before you fine-tune

Why good evals come first: how to measure whether prompting or RAG already solved it before spending on training.

Sources & review

Published by Tech Jacks Solutions · Reviewed June 2026. This lesson explains established concepts and is grounded in the references below; figures shown in the interactives are illustrative and labelled as such.

Customizing a model: prompt vs RAG vs fine-tune — in 5 minutes

Tech Jacks Solutions · AI Knowledge Hub · educational summary

What it means

Customizing a model means changing what an off-the-shelf model produces for your use case — without building one from scratch. Three levers, from lightest to heaviest: prompting, RAG, and fine-tuning.

The three levers

  • Prompting: shape the output at request time with instructions, context, and examples. The model is unchanged; cheapest and fastest.
  • RAG: add knowledge at answer time by retrieving your documents. The weights are unchanged; update by editing the knowledge base.
  • Fine-tuning: continue training the model on example input/output pairs so behavior is baked into the weights. Heaviest to build and maintain.

What fine-tuning does (vs RAG)

Fine-tuning adjusts the model's weights by training on examples — it changes behavior. RAG leaves the weights untouched and adds retrieved knowledge to the prompt — it changes available knowledge. Fine-tuning re-trains the model; RAG just hands it better context.

SFT, LoRA & PEFT

SFT (supervised fine-tuning) trains on input/output pairs; instruction tuning teaches a base model to follow instructions. PEFT trains only a small set of parameters and freezes the rest to cut cost; LoRA freezes the base weights and trains small low-rank update matrices, allowing swappable per-task adapters with no added inference latency once merged.

How to choose

Start with prompting (cheapest, fastest). If the gap is missing or fast-changing knowledge, add RAG. Fine-tune only if the gap is behavior/style/format/skill that prompting and RAG can't reliably achieve. The levers combine — e.g. fine-tune for behavior, RAG for facts. Build good evaluations before investing in fine-tuning.

Before you act on AI output. This is an educational module. AI systems can produce plausible-sounding but incorrect guidance. For decisions that carry real consequences — security, legal, financial, medical, or compliance — verify with a qualified professional before acting. The customization examples here are illustrative teaching scenarios, not measured results or claims about any specific product or vendor. External links are provided for learning and may change; confirm against the official source. See sources.json for grounding and editorial cautions.