Track 02 · Foundations · Intermediate · ~8 min

Embeddings & vector search: meaning as geometry

An embedding turns a piece of text into a vector (a list of numbers that captures its meaning) so that texts with similar meanings land near each other in space. A vector store indexes those vectors, and vector search finds the ones nearest to a query. That single idea powers semantic search and the retrieval step in RAG. This lesson covers the space, the cosine measure, and the pipeline.

01 What an embedding is

Imagine giving every piece of text a map pin, placed so that things which mean similar things sit close together and unrelated things sit far apart. That's the whole idea here: turning meaning into location. The technique that does it is called an embedding — it converts a piece of text (a word, a sentence, a whole paragraph) into a vector, which is just a fixed-length list of numbers. Those numbers are produced so they capture the text's meaning rather than its exact spelling: two passages that mean similar things get similar numbers, and passages about unrelated topics get very different numbers. Read each number as a coordinate, and every piece of text becomes a single point in space — real models use hundreds or thousands of dimensions, far more than we can draw.

Text in, a vector of numbers out — the same model embeds both your documents and your query the same way.
The vector encodes meaning, not spelling: "car" and "automobile" land close together even though the two words are spelled completely differently.
Because meaning becomes geometry, you can compare texts with simple math instead of matching words.

02 Close in space = close in meaning

If similar meanings get similar vectors, then nearby points mean similar things. The interactive map for this section is a tiny, hand-placed set of words in a flattened 2-D space (real embeddings have far more dimensions; this is an illustration, not a model's output). Selecting a word highlights its nearest neighbours and shows a similarity score for each. The words form clusters: royalty, animals, and vehicles each huddle together.

A word's closest neighbours (those at the smallest distance) are the most similar in meaning, and they will almost always come from the same cluster.
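The nearest-neighbour idea can be sketched in a few lines. The coordinates below are invented for illustration (they are not the lesson's actual map and not the output of any embedding model), and `WORD_MAP` and `nearest_neighbours` are hypothetical names.

```python
import math

# Hypothetical hand-placed 2-D coordinates, invented for illustration only.
# They are not the output of any embedding model.
WORD_MAP = {
    "king": (1.0, 9.0),
    "queen": (1.6, 9.2),
    "prince": (1.2, 8.1),
    "cat": (8.0, 8.5),
    "dog": (8.6, 8.0),
    "horse": (7.6, 7.4),
    "car": (5.0, 1.5),
    "truck": (5.7, 1.2),
    "bus": (4.6, 0.8),
}

def nearest_neighbours(word, k=2):
    """Return the k words closest to `word` by straight-line distance."""
    origin = WORD_MAP[word]
    others = [(math.dist(origin, point), other)
              for other, point in WORD_MAP.items() if other != word]
    others.sort()  # smallest distance first
    return [other for _, other in others[:k]]

for word in ("king", "dog", "car"):
    print(word, "->", nearest_neighbours(word))
```

With these made-up coordinates, each word's two nearest neighbours come from its own cluster, which is the pattern the map is meant to show.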

Cosine similarity compares the direction of two vectors: 1.0 means the same direction (most similar), 0 means perpendicular (unrelated), and −1 means opposite. For text embeddings the score is usually positive. It's the measure most vector search uses.

Two ways to measure "closeness" come up constantly: distance (how far apart two points are) and cosine similarity (whether two vectors point the same way). Cosine ignores how long the vectors are and looks only at direction, which is why it's the default for comparing meaning: a short note and a long document about the same topic can still point the same way.
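A minimal sketch of both measures, using only the standard library. Cosine similarity is the dot product of two vectors divided by the product of their lengths. The three vectors are made-up values chosen so the arithmetic is easy to follow; they do not come from a real model.

```python
import math

def cosine_similarity(a, b):
    """Dot product divided by the product of the two vector lengths."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def euclidean_distance(a, b):
    """Straight-line distance between two points."""
    return math.dist(a, b)

short_note = [1.0, 2.0, 0.5]       # hypothetical vector for a short note
long_document = [4.0, 8.0, 2.0]    # same direction, four times longer
unrelated = [2.0, -1.0, 0.0]       # perpendicular to short_note

print("cosine, note vs document:  ", cosine_similarity(short_note, long_document))
print("distance, note vs document:", euclidean_distance(short_note, long_document))
print("cosine, note vs unrelated: ", cosine_similarity(short_note, unrelated))
```

The note and the document are far apart by distance yet score about 1.0 on cosine, because they differ only in length. The perpendicular vector scores 0.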

03 Where the vectors live

Once your text is embedded, the vectors need a home that can search by similarity. That home is a vector store (often called a vector database). It does three jobs: it stores each vector with a reference back to its original text, it builds an index so it doesn't have to compare every vector one by one, and it answers a query by returning the closest vectors.

The vector store (store → index → query)

Store: vector + reference
Index (ANN): group similar
Query: embed the question
Metric: cosine / distance

Job 1: Store (keeping the vectors)

Every chunk's embedding is saved alongside a reference to the original text (and any metadata you want to filter on later, like source or date). The vector is what gets searched; the reference is what lets the system hand you back the real passage once a match is found.
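The store job can be sketched as a small in-memory class. This is a toy under stated assumptions, not how any particular vector database works: `Record` and `TinyVectorStore` are hypothetical names, the two-number vectors are made up, and the query compares against every record instead of using an index.

```python
import math
from dataclasses import dataclass, field

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

@dataclass
class Record:
    vector: list                                   # what gets searched
    text: str                                      # reference back to the original passage
    metadata: dict = field(default_factory=dict)   # e.g. source or date, used for filtering

class TinyVectorStore:
    """Toy store: keeps records in a list and compares the query to every one."""

    def __init__(self):
        self.records = []

    def add(self, vector, text, **metadata):
        self.records.append(Record(list(vector), text, metadata))

    def query(self, vector, k=1, where=None):
        """Return the k most similar records, optionally filtered by metadata."""
        where = where or {}
        candidates = [r for r in self.records
                      if all(r.metadata.get(key) == value for key, value in where.items())]
        candidates.sort(key=lambda r: cosine(vector, r.vector), reverse=True)
        return candidates[:k]

store = TinyVectorStore()
store.add([0.9, 0.1], "Refunds are issued within 14 days.", source="policy")
store.add([0.1, 0.9], "Our office is closed on public holidays.", source="handbook")
store.add([0.8, 0.3], "Store credit can replace a refund.", source="faq")

print(store.query([1.0, 0.0], k=1)[0].text)
print(store.query([1.0, 0.0], k=1, where={"source": "faq"})[0].text)
```

The search runs over the vectors, but what comes back is the record, so the caller gets the original passage and its metadata.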

04 Nearest-neighbour search in action

Finding the closest vectors to a query is a nearest-neighbour search. Checking every vector exactly is slow at scale, so vector stores use approximate nearest neighbour (ANN) indexes that trade a little accuracy for a lot of speed. That fast meaning-based lookup is what powers semantic search and the retrieval step inside RAG.
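To make the accuracy-for-speed trade concrete, here is one simple way an approximate index can work: group vectors around a few centroids, then scan only the groups nearest the query. This is a sketch of the general idea, not the algorithm of any specific vector store. `BucketIndex`, `n_buckets`, and `n_probe` are hypothetical names, the data is random, and the centroids are just sampled data points with no further refinement.

```python
import math
import random

def exact_search(query, vectors, k):
    """Brute force: measure the distance to every vector."""
    ranked = sorted(range(len(vectors)), key=lambda i: math.dist(query, vectors[i]))
    return ranked[:k]

class BucketIndex:
    """Toy ANN index: group vectors around centroids, search only nearby groups."""

    def __init__(self, vectors, n_buckets, seed=0):
        self.vectors = vectors
        rng = random.Random(seed)
        # Assumption: randomly sampled data points serve as centroids.
        self.centroids = rng.sample(vectors, n_buckets)
        self.buckets = [[] for _ in range(n_buckets)]
        for i, v in enumerate(vectors):
            self.buckets[self.closest_centroids(v, 1)[0]].append(i)

    def closest_centroids(self, v, n):
        order = sorted(range(len(self.centroids)),
                       key=lambda c: math.dist(v, self.centroids[c]))
        return order[:n]

    def search(self, query, k, n_probe=1):
        """Scan only the n_probe buckets whose centroids are closest to the query."""
        candidates = [i for c in self.closest_centroids(query, n_probe)
                      for i in self.buckets[c]]
        candidates.sort(key=lambda i: math.dist(query, self.vectors[i]))
        return candidates[:k]

rng = random.Random(42)
vectors = [[rng.random(), rng.random()] for _ in range(2000)]
index = BucketIndex(vectors, n_buckets=20)
query = [0.5, 0.5]

exact = exact_search(query, vectors, k=5)
approx = index.search(query, k=5, n_probe=2)
scanned = sum(len(index.buckets[c]) for c in index.closest_centroids(query, 2))
print("exact: ", exact)
print("approx:", approx)
print("vectors scanned by the index:", scanned, "of", len(vectors))
```

The index looks at only a fraction of the vectors, so it can miss a true neighbour that sits just across a bucket boundary. Raising `n_probe` scans more buckets and recovers accuracy at the cost of speed.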

Semantic search — match meaning, not words

Embed the query, find the nearest vectors, return their original passages. Because matching is by meaning, a search for "how do I get my money back" can surface a passage titled "Refund policy" even with no shared keywords.

Step 1: embed the query with the same model
Step 2: ANN search returns the closest vectors
Step 3: show the texts those vectors point to
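The three steps can be sketched end to end. The big assumption is `toy_embed`: it is a hand-built stand-in that fakes what a learned embedding model does by counting words per hand-picked concept. A real model learns these relationships rather than being told them. `CONCEPTS`, `PASSAGES`, and `semantic_search` are likewise hypothetical, and the search is exact because the collection is tiny.

```python
import math

# Hypothetical stand-in for a real embedding model: each dimension is a
# hand-picked "concept", and a text's vector counts the words in each concept.
CONCEPTS = {
    "refund": {"refund", "refunds", "money", "back", "reimburse"},
    "shipping": {"shipping", "delivery", "deliver", "arrive", "package"},
    "account": {"account", "password", "login", "sign"},
}

def toy_embed(text):
    words = text.lower().replace("?", " ").replace(".", " ").split()
    return [sum(w in vocab for w in words) for vocab in CONCEPTS.values()]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.hypot(*a) * math.hypot(*b)
    return dot / norm if norm else 0.0  # all-zero vectors match nothing

PASSAGES = [
    "Refund policy",
    "Shipping and delivery times",
    "Resetting your account password",
]

def semantic_search(query, k=1):
    query_vector = toy_embed(query)                      # step 1: embed the query
    passage_vectors = [toy_embed(p) for p in PASSAGES]   # normally done once, ahead of time
    ranked = sorted(range(len(PASSAGES)),                # step 2: nearest vectors (exact here)
                    key=lambda i: cosine(query_vector, passage_vectors[i]),
                    reverse=True)
    return [PASSAGES[i] for i in ranked[:k]]             # step 3: return the original texts

print(semantic_search("How do I get my money back?"))
```

The query and the winning passage share no words. They match only because both land on the same concept dimension, which is the behaviour a real embedding model provides at much larger scale.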

Powering RAG — this is the "retrieve" step

RAG (Retrieval-Augmented Generation) fetches relevant documents before a language model answers. The fetch is exactly this vector search: embed the question, pull the nearest chunks from the store, and hand them to the model as context so the answer is grounded in real sources.

Retrieve: nearest chunks from the vector store
Augment: add those chunks to the prompt
Generate: the model answers from them
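A minimal sketch of how the three stages fit together. The retriever and the model are passed in as plain functions, and `fake_retrieve` and `fake_generate` are hypothetical stand-ins so the example runs without a vector store or a language model. The prompt wording is also just one plausible choice.

```python
def build_prompt(question, chunks):
    """Augment: put the retrieved chunks in front of the question."""
    context = "\n".join(f"[{i}] {chunk}" for i, chunk in enumerate(chunks, start=1))
    return (
        "Answer the question using only the sources below.\n\n"
        f"Sources:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

def answer_with_rag(question, retrieve, generate, k=2):
    chunks = retrieve(question, k)            # retrieve
    prompt = build_prompt(question, chunks)   # augment
    return generate(prompt)                   # generate

# Hypothetical stand-ins: a real system would run a vector search and call a model.
def fake_retrieve(question, k):
    corpus = [
        "Refunds are issued within 14 days of purchase.",
        "Refunds go back to the original payment method.",
        "Our office is closed on public holidays.",
    ]
    return corpus[:k]

def fake_generate(prompt):
    return f"(model output would go here; prompt was {len(prompt)} characters)"

print(build_prompt("How long do refunds take?", fake_retrieve("", 2)))
print(answer_with_rag("How long do refunds take?", fake_retrieve, fake_generate))
```

Keeping retrieval and generation as separate, swappable functions mirrors the description above: the vector search only decides what context the model sees.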

vs keyword search — different strengths

Classic keyword search matches the exact words you type — great for names, codes, and precise terms, but it misses synonyms and paraphrases. Vector search matches meaning — great for natural-language questions, but it can blur very specific terms. Many real systems combine both (often called hybrid search).

Keyword: exact terms, codes, names → precise
Vector: synonyms, paraphrases, intent → flexible
Hybrid: combine the two for the best of each
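One common way to combine the two result lists is reciprocal rank fusion (RRF). The lesson does not say which method a given system uses, so treat this as one option among several. The document ids are invented, and `k=60` is a frequently used default, not a required value.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge ranked lists: each list adds 1 / (k + rank) to a document's score."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)

# Hypothetical results for the same query from two separate searches.
keyword_ranking = ["doc_sku_4471", "doc_returns", "doc_shipping"]
vector_ranking = ["doc_refund_policy", "doc_returns", "doc_sku_4471"]

print(reciprocal_rank_fusion([keyword_ranking, vector_ranking]))
```

Documents that appear in both lists rise to the top, while a document found by only one search still makes the merged list. RRF uses ranks rather than raw scores, so keyword and vector scores never need to share a scale.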

05 The pipeline and where it breaks

Putting it together, building a searchable knowledge base is a short pipeline: chunk your documents into passages, embed each chunk with an embedding model, and store the vectors in a vector index. At query time you embed the query the same way and run a nearest-neighbour search. The same embedding model must be used on both sides — vectors from different models aren't comparable.

Chunk — split long documents into smaller passages so a match points to just the relevant part, and the text fits in a prompt.
Embed — turn each chunk (and later, each query) into a vector with the same model.
Store & index — load the vectors into a vector store that can search them by similarity.
Search — embed the query, return the nearest chunks, and use them downstream (display them, or feed them to a model for RAG).
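The chunk step is the easiest of the four to show in isolation. This sketch splits on whitespace into fixed-size word windows. The overlap between neighbouring chunks is an assumption added here (a common way to avoid cutting an idea in half), and `chunk_words` with its default sizes is hypothetical, not a recommendation.

```python
def chunk_words(text, chunk_size=50, overlap=10):
    """Split text into chunks of chunk_size words; neighbours share `overlap` words."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this chunk already reached the end of the text
    return chunks

document = " ".join(f"w{i}" for i in range(12))  # placeholder text: w0 w1 ... w11
for chunk in chunk_words(document, chunk_size=5, overlap=2):
    print(chunk)
```

Each chunk would then be embedded and stored with a reference to its source, and the query would be embedded with the same model at search time.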

Embeddings are powerful but not magic. A few honest limits worth knowing:

Quality depends on the model. Embeddings reflect what their model learned; a general model may struggle with specialised jargon or a language it saw little of.
Chunking choices matter. Chunks that are too big blur several ideas into one vague vector; too small and they lose context. The right size depends on your content.
"Close" isn't always "correct." The nearest vector is the most similar, which is usually but not always the most relevant — retrieval can surface a plausible-looking but unhelpful passage.
Approximate search trades some accuracy. ANN indexes are fast because they may occasionally miss a true nearest neighbour. Usually a worthwhile trade, but worth remembering.
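The last limit can be measured directly. A common check is recall at k: run an exact search and an approximate search for the same query and count how many true neighbours the approximate result kept. The id lists below are invented for illustration, and `recall_at_k` is a hypothetical helper name.

```python
def recall_at_k(approximate_ids, exact_ids):
    """Fraction of the true nearest neighbours that the approximate search returned."""
    if not exact_ids:
        return 1.0
    return len(set(approximate_ids) & set(exact_ids)) / len(exact_ids)

exact_ids = [17, 4, 92, 33, 8]         # hypothetical result of a brute-force search
approximate_ids = [17, 4, 33, 8, 61]   # hypothetical result of an ANN index

print(recall_at_k(approximate_ids, exact_ids))
```

Here the approximate search kept four of the five true neighbours. Averaging this over many queries gives a rough picture of how much accuracy an index setting gives up.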

06 Test your knowledge

07 Take it with you & go deeper

"Embeddings & vector search in 5 minutes" — one-page summary

The whole module distilled to a printable cheat-sheet.

Sources & review

Published by Tech Jacks Solutions · Reviewed June 2026. This lesson explains established concepts and is grounded in the references below; figures shown in the interactives are illustrative and labelled as such.

Vector embeddings — Pinecone (learn)
Efficient Estimation of Word Representations in Vector Space (word2vec) — Mikolov et al. (2013)
Text embedding models — LangChain
Vector stores — LangChain
Retrieval-augmented generation (RAG) — Pinecone (learn)
Retrievers — LangChain

Embeddings & vector search — in 5 minutes

Tech Jacks Solutions · AI Knowledge Hub · educational summary

What an embedding is

An embedding turns a piece of text into a vector — a fixed-length list of numbers produced by an embedding model so that the numbers capture meaning. Text with similar meaning gets similar numbers, so every passage becomes a point in a high-dimensional space.

Close in space = close in meaning

Because similar meanings land near each other, you can compare texts with simple geometry. Distance measures how far apart two points are; cosine similarity measures whether two vectors point the same direction (1.0 = same direction, 0 = unrelated, −1 = opposite) — it's the measure most vector search uses.

The vector store

A vector store (vector database) holds each vector with a reference to its original text, builds an index so it need not compare every vector one by one, and answers a query by returning the closest vectors using a similarity metric.

Vector search → semantic search & RAG

Finding the closest vectors is a nearest-neighbour search; at scale, approximate nearest-neighbour (ANN) indexes trade a little accuracy for speed. This meaning-based lookup powers semantic search (match meaning, not exact words) and the "retrieve" step of RAG.

Pipeline & limits

Pipeline: chunk documents → embed each chunk → store & index → embed the query → search for nearest chunks (use the same embedding model on both sides). Limits: quality depends on the model; chunk size matters; "closest" is not always "most relevant"; ANN trades a little accuracy for speed.
