Foundations lesson

Track 01 · Foundations · Novice · ~8 min

Supervised, unsupervised, or reinforcement learning?

Almost every machine-learning system fits into one of three families — and the family is decided by one thing: what feedback the model learns from. Labels, hidden structure, or reward. Learn to tell them apart, then sort real tasks into the right bucket yourself, right here on the page.

01 Three families, decided by one thing

When people say "machine learning," they usually mean one of three broad paradigms. They aren't different algorithms so much as different learning settings — and the thing that separates them is the feedback signal each one learns from. Supervised learning gets labels (the right answer attached to every example). Unsupervised learning gets no labels at all and has to find structure on its own. Reinforcement learning gets a reward — a score earned by acting in an environment over time. Keep that one question in mind — "what does it learn from?" — and the rest follows.

Signal: labels

Supervised

Trained on input–output pairs where each example carries its correct answer. It learns a mapping from inputs to known targets.

Signal: structure

Unsupervised

Works on unlabeled data and discovers structure within it — grouping similar points, or compressing the data into a simpler form.

Signal: reward

Reinforcement

An agent acts in an environment and learns a policy by maximizing a cumulative reward earned through trial-and-error interaction.

The paradigms differ chiefly in their feedback signal: explicit answers (labels), no labels at all, or a scalar reward earned over time.
The boundaries can blur: semi-supervised learning mixes labeled and unlabeled data, and self-supervised learning generates its own training signal from unlabeled data, so neither fits cleanly into the three-way split.
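To make the three feedback signals concrete, here is a minimal sketch of what a single training record might look like in each setting. The class names, field names, and sample values are illustrative assumptions, not part of any library.

```python
from dataclasses import dataclass


@dataclass
class LabeledExample:
    """Supervised: an input together with its correct answer."""
    features: list[float]
    label: str


@dataclass
class UnlabeledExample:
    """Unsupervised: an input only; any structure must be discovered."""
    features: list[float]


@dataclass
class Transition:
    """Reinforcement: one step of interaction with an environment."""
    state: int
    action: int
    reward: float      # scalar feedback earned by acting
    next_state: int


# Hypothetical records, one per paradigm.
spam_email = LabeledExample(features=[0.9, 0.1], label="spam")
customer = UnlabeledExample(features=[12.0, 340.5])
game_move = Transition(state=3, action=1, reward=1.0, next_state=4)
```

Only the first record carries an answer, the second carries nothing but the input, and the third carries a score for an action rather than a correct output.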

02 Supervised — learning from labeled answers

Supervised learning trains a model on labeled examples (input–output pairs) so that it learns a mapping from inputs to known target outputs. Give it thousands of emails already marked "spam" or "not spam," and it learns the pattern that connects an email to its label, so it can label new emails it has never seen. There are two classic shapes: classification, where the answer is a discrete label (spam / not spam, cat / dog), and regression, where the answer is a continuous number (tomorrow's temperature, a house price). The goal is a mapping that generalizes, which is why supervised models are evaluated on held-out data they didn't train on.

Needs labels: every training example comes with its correct answer attached.
Two task types: classification (discrete labels) and regression (continuous values).
Example methods: linear and logistic regression, support vector machines, decision trees, and neural-network classifiers.
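As a rough illustration of the supervised loop (fit on labeled pairs, then check on held-out pairs), the sketch below fits a one-variable linear regression by ordinary least squares in plain Python. The house-price numbers and the choice of which rows to hold out are made up for the example.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    return slope, intercept


def mean_squared_error(slope, intercept, xs, ys):
    """Average squared gap between predictions and the known answers."""
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Hypothetical labeled data: floor area (m^2) -> sale price (thousands).
areas = [50, 60, 70, 80, 90, 100, 110, 120]
prices = [152, 178, 211, 239, 272, 298, 331, 362]

# Hold out two examples the model never sees during fitting.
held_out = {2, 5}
train_x = [a for i, a in enumerate(areas) if i not in held_out]
train_y = [p for i, p in enumerate(prices) if i not in held_out]
test_x = [areas[i] for i in sorted(held_out)]
test_y = [prices[i] for i in sorted(held_out)]

slope, intercept = fit_line(train_x, train_y)
test_mse = mean_squared_error(slope, intercept, test_x, test_y)
print(f"slope={slope:.3f} intercept={intercept:.3f} held-out MSE={test_mse:.2f}")
```

The held-out error, not the training error, is the number that says whether the learned mapping generalizes.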

03 Unsupervised — finding structure with no labels

Unsupervised learning works on unlabeled data, and its job is to discover structure within it; nobody hands it the right answers. The most common version is clustering: grouping similar data points together, like sorting a pile of customer records into segments that behave alike, without anyone defining the segments in advance. The other big family is dimensionality reduction, which compresses data into a simpler representation that keeps what matters and drops the noise. Density estimation, which models how the data is distributed, is usually grouped alongside it. Because there's no answer key, success is harder to measure than in supervised learning: you're looking for useful patterns, not a known target.

No labels: the system is given raw data and must find the patterns itself.
Two big families: clustering (grouping by similarity) and dimensionality reduction / density estimation (compressing or representing data).
Example methods: k-means and DBSCAN clustering, and PCA for dimensionality reduction.
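Clustering can be sketched in a few lines. Below is a bare-bones k-means loop in plain Python, assuming two hypothetical customer features (visits per month, average basket value) and, for simplicity, using the first k points as starting centroids. Real implementations usually initialize more carefully.

```python
import math


def nearest_index(point, centroids):
    """Index of the centroid closest to the point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))


def kmeans(points, k, iterations=10):
    centroids = list(points[:k])  # simplistic initialization (an assumption)
    for _ in range(iterations):
        # Assignment step: group each point with its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest_index(p, centroids)].append(p)
        # Update step: move each centroid to the mean of its group.
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    assignments = [nearest_index(p, centroids) for p in points]
    return centroids, assignments


# Hypothetical, unlabeled customer records: (visits per month, average basket).
customers = [(1, 20), (2, 22), (1, 18), (9, 80), (10, 85), (8, 78)]
centroids, assignments = kmeans(customers, k=2)
print(assignments)
```

No segment names are supplied anywhere: the two groups emerge only from which records sit close together.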

A note for the curious: self-supervised learning — which underpins modern large language models — generates its own training signal from unlabeled data and is usually described separately from this classic unsupervised setting.

04 Reinforcement — learning by reward and trial

Reinforcement learning is the odd one out. There's no fixed dataset of answers at all. Instead, an agent takes actions in an environment and learns a policy — a strategy for what to do — by maximizing a cumulative reward signal it earns through trial-and-error interaction over time. Think of a program that plays a game thousands of times, rewarded when it wins and penalized when it loses, gradually shifting its behavior toward whatever earns more reward. A defining feature is the exploration-versus-exploitation trade-off: the agent has to balance trying new actions to gather information against exploiting the actions it already knows pay off. Landmark examples include systems that learned to play Atari games directly from the screen, and the program that mastered the board game Go through self-play.

No answer key: feedback is a scalar reward earned by acting, not a label attached to each example.
Learns a policy: a strategy mapping situations to actions, tuned to maximize reward over time.
Exploration vs exploitation: balancing new actions (to learn) against known good actions (to score).
Example methods: Q-learning, deep Q-networks (DQN), and policy-gradient methods.
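The points above can be sketched with tabular Q-learning on a toy problem. Everything about the environment here is an assumption made for illustration: a five-cell corridor, a reward of 1 for reaching the rightmost cell, and hand-picked values for the learning rate, discount, and exploration rate.

```python
import random

N_STATES = 5          # corridor cells 0..4; cell 4 is the goal
ACTIONS = (0, 1)      # 0 = step left, 1 = step right


def step(state, action):
    """Toy environment: returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def choose_action(q_row, epsilon, rng):
    """Epsilon-greedy: explore with probability epsilon, otherwise exploit."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)                                  # explore
    best = max(q_row)
    return rng.choice([a for a in ACTIONS if q_row[a] == best])    # exploit, ties broken at random


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, max_steps=200, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            action = choose_action(q[state], epsilon, rng)
            next_state, reward, done = step(state, action)
            # Q-learning update: move the estimate toward reward + discounted best next value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


q_table = train()
# Greedy policy for the non-terminal cells: the action with the higher learned value.
policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
print(policy)
```

The agent is never told that "right" is correct. It only sees the reward at the goal, and the value of moving right propagates backward through the table over repeated episodes.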

05 Sort the task: which paradigm fits?

Here's where it clicks. Below are real machine-learning tasks. Pick a task, then choose the bin you think it belongs in — Supervised, Unsupervised, or Reinforcement. You'll get instant feedback and a one-line reason for each, and the panel underneath shows the signal each paradigm learns from. The fastest way to decide is to ask: does this task come with labeled answers, just raw data to find structure in, or a reward earned by acting?

What signal does each paradigm learn from?

Labels

Each example carries its correct answer; the model learns the input→answer mapping.

Structure

No answers given; the model groups similar data or compresses it to reveal hidden patterns.

Reward

A scalar score earned by acting in an environment; the model tunes its policy to earn more over time.
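The sorting question can be written down as a tiny decision rule. This is only a sketch: the function name and the order of the checks are assumptions, and the three sample tasks are the ones used earlier in this lesson (spam filtering, customer segmentation, game playing).

```python
def paradigm_for(has_labels: bool, has_reward: bool) -> str:
    """Pick a paradigm from the feedback signal a task provides."""
    if has_reward:
        return "reinforcement"   # a score earned by acting in an environment
    if has_labels:
        return "supervised"      # each example carries its correct answer
    return "unsupervised"        # raw data only; structure must be found


tasks = {
    "flag emails as spam or not spam": paradigm_for(has_labels=True, has_reward=False),
    "segment customers by behavior": paradigm_for(has_labels=False, has_reward=False),
    "learn to win a game by playing it": paradigm_for(has_labels=False, has_reward=True),
}
for task, paradigm in tasks.items():
    print(f"{task}: {paradigm}")
```

Real tasks can be messier than two booleans suggest, which is where the semi-supervised and self-supervised settings mentioned earlier come in.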

06 Check your understanding

07 Take it with you & go deeper

"Supervised vs unsupervised vs reinforcement" — one-page summary

The whole lesson distilled to a printable cheat-sheet.

▸ Related lessons in the hub

AI vs machine learning vs deep learning

Zoom out one level: how these three paradigms all sit inside the broader machine-learning circle.

How neural networks work

The model architecture that powers supervised classifiers and deep reinforcement learners alike.

▸ Coming next — deeper progression

RLHF: reinforcement learning from human feedback

See reinforcement learning at work in modern language models — reward signals built from human preferences.

Coming soon

Clustering, hands-on

A closer look at the most common unsupervised task — how k-means and DBSCAN actually group data.

★Sources & further reading

Published by Tech Jacks Solutions · Reviewed June 2026. This lesson explains established, definitional concepts and is grounded in the canonical references below; the interactive uses illustrative example tasks chosen to teach the distinction.

Reinforcement Learning: An Introduction (2nd ed.) — Richard S. Sutton & Andrew G. Barto
Deep Learning — Ch. 5: Machine Learning Basics — Goodfellow, Bengio & Courville
Pattern Recognition and Machine Learning — Christopher M. Bishop
The Elements of Statistical Learning (2nd ed.) — Hastie, Tibshirani & Friedman
Reinforcement Learning: A Survey — Kaelbling, Littman & Moore (JAIR)
scikit-learn — Supervised learning — scikit-learn developers
scikit-learn — Unsupervised learning — scikit-learn developers
ISO/IEC 22989 — AI concepts and terminology — ISO/IEC
Machine Learning Crash Course — Google

Responsible AI & editorial note

This is an educational explainer covering established machine-learning concepts. The example tasks in the interactive are illustrative teaching cases, not claims about any specific commercial product. Definitions follow the cited primary sources; where a topic is genuinely contested or evolving (for example, where self-supervised and semi-supervised learning sit relative to the classic three-way split), we say so in the text.

AI systems can produce plausible-sounding but incorrect guidance. For decisions with real-world stakes, verify against the primary sources linked above and consult a qualified professional. This lesson is a self-assessment study aid, not a professional certification.

⊕Concept map

The whole lesson on one screen: three learning paradigms, separated by the feedback signal each one learns from. Expand a branch to review the essentials.

Three families, decided by one thing

The three paradigms are different learning settings, separated by the feedback signal each learns from.
One question sorts them: what does it learn from — labels, raw structure, or a reward?
The boundaries can blur: semi-supervised and self-supervised learning don’t fit cleanly into the three-way split.

Supervised — learning from labeled answers

Trains on labeled examples — input–output pairs — learning a mapping to known target outputs.
Two task types: classification (discrete labels) and regression (continuous values).
Goal is a mapping that generalizes, so models are tested on held-out data they didn’t train on.
Example methods: linear/logistic regression, SVMs, decision trees, neural-network classifiers.

Unsupervised — finding structure with no labels

Works on unlabeled data and must discover structure itself — no answer key is given.
Two big families: clustering (grouping by similarity) and dimensionality reduction / density estimation.
Example methods: k-means and DBSCAN clustering, and PCA for dimensionality reduction.
Self-supervised learning, which underpins modern LLMs, is usually described separately from this classic setting.

Reinforcement — learning by reward and trial

No answer key: an agent acts in an environment and learns from a scalar reward earned through trial and error.
Learns a policy — a strategy mapping situations to actions — tuned to maximize cumulative reward.
Exploration vs exploitation: balancing new actions to learn against known good actions to score.
Example methods: Q-learning, deep Q-networks (DQN), and policy-gradient methods.

Sort the task: which paradigm fits?

To classify a task, ask whether it comes with labeled answers, just raw data to find structure in, or a reward earned by acting.
Labels → supervised; structure → unsupervised; reward → reinforcement.
The signal a paradigm learns from is the fastest way to decide where a task belongs.

Supervised vs unsupervised vs reinforcement learning — in 5 minutes

Tech Jacks Solutions · AI Knowledge Hub · educational summary

One question decides the family

Ask "what does the model learn from?" Supervised learns from labels, unsupervised from structure in unlabeled data, reinforcement from a reward earned by acting.

Supervised learning

Trains on labeled examples (input-output pairs) to learn a mapping to known answers. Two task types: classification (discrete labels) and regression (continuous values). Evaluated on held-out data to check that it generalizes.

Unsupervised learning

Works on unlabeled data and discovers structure: clustering (grouping by similarity) and dimensionality reduction (compressing data). No answer key, so success is harder to measure.

Reinforcement learning

An agent takes actions in an environment and learns a policy by maximizing cumulative reward through trial and error. Key tension: exploration vs exploitation. Methods: Q-learning, DQN, policy gradients.

The blur

Semi-supervised learning mixes labeled and unlabeled data, and self-supervised learning generates its own training signal from unlabeled data, so neither fits cleanly into the three-way split.
