How AI helps you write code
It starts as autocomplete finishing your line and ends as an agent that reads a task, edits files, runs the tests, and fixes its own mistakes. Learn how it works, where the real value is, and why every line still needs a human review.
01 Autocomplete, assistant, agent
Imagine a phone keyboard that guesses your next word. Now imagine that same idea, but for code — and dialed up. The simplest AI coding help is exactly that: autocomplete that finishes the line you're typing. Turn the dial up and you get a chat assistant — you ask it to explain a function or make one specific change, and it does just that. Turn the dial all the way up and you get an autonomous coding agent: you hand it a goal in plain English — "add a logout button" — and it works on its own, editing files and running commands until it's done. The key idea is an autonomy slider: you choose how much independence to give it.
- Autocomplete finishes your current line — fast typing help, one step at a time.
- Chat assistant answers questions and makes targeted edits you ask for.
- Agent takes a goal and acts across files and commands on its own — the most autonomy, and the most need for review.
02 How does it know what to type?
Underneath, an AI coding tool is a large language model: the same kind of model that powers a chatbot, but trained on an enormous amount of code. It works by predicting the next token, a small chunk of text such as a symbol, a word, or a piece of a word, given everything you've already written. Type "def calculate_" and the model has seen so many functions that it can guess a sensible continuation. It isn't copying one file from memory and it isn't running your program; it's making an informed prediction from patterns it learned. And it doesn't look only at your one line: it reads the surrounding file too, which is how it matches your brackets and your style.
Next-token prediction, one piece at a time
The model picks a likely next token, adds it, then predicts the one after that, building the suggestion step by step. Because it was trained on huge amounts of code, its "most likely next piece" usually looks like working code.
It reads the code around your cursor
The prediction isn't blind to the rest of the file. It uses nearby variables, imports, and style so the suggestion fits — closing brackets correctly and reusing names you've already defined.
A likely continuation — not a guarantee
"Most likely" is not "correct." The suggestion is a prediction, so it can be plausible and still wrong. That's why you read it before you accept it — a theme we return to in section 5.
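To make the prediction idea concrete, here is a minimal sketch that assumes a toy model: it only counts which token followed which in a tiny, made-up corpus, then always appends the most frequent follower (greedy decoding). Real coding models are neural networks that read far more context than one previous token, so the corpus, `train_bigrams`, and `complete` are hypothetical names for illustration only.

```python
from collections import Counter, defaultdict

# Tiny, made-up "training corpus" of tokenized code lines (hypothetical data).
CORPUS = [
    ["def", "calculate_total", "(", "items", ")", ":"],
    ["def", "calculate_tax", "(", "amount", ")", ":"],
    ["def", "calculate_total", "(", "prices", ")", ":"],
    ["return", "sum", "(", "items", ")"],
]


def train_bigrams(corpus):
    """Count which token follows each token in the corpus."""
    follows = defaultdict(Counter)
    for line in corpus:
        for current, nxt in zip(line, line[1:]):
            follows[current][nxt] += 1
    return follows


def complete(prompt, follows, max_new_tokens=5):
    """Greedy decoding: repeatedly append the most frequent next token."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        candidates = follows.get(tokens[-1])
        if not candidates:
            break  # nothing ever followed this token in the corpus
        tokens.append(candidates.most_common(1)[0][0])
    return tokens


MODEL = train_bigrams(CORPUS)
SUGGESTION = complete(["def"], MODEL)
WRONG = complete(["return"], MODEL)
print(" ".join(SUGGESTION))  # def calculate_total ( items ) :
print(" ".join(WRONG))       # return sum ( items ) :
```

The first suggestion looks like working code. The second ends in a stray colon: each step was the most likely one, yet the result is not valid Python, which is the "likely is not correct" point in miniature.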
03 How it finds the right code in your project
A model trained on the world's code still knows nothing about your project — your function names, your APIs, your conventions. So before it answers, a good tool goes and finds the relevant parts of your codebase. Two steps make this work. First, indexing: when you open a project, the tool scans the whole repository to learn how files, modules, and dependencies fit together. Second, semantic search: when you ask for something, it searches by meaning, not just keywords, to find the code blocks that are conceptually related to your task. The most relevant snippets are then pulled into the model's context window — its short-term working memory — so its suggestions match your real code instead of a generic template.
- Indexing maps how your repository fits together, so the tool knows where things live.
- Semantic search finds related code by meaning — surfacing files a plain keyword search would miss.
- The context window is finite. If the relevant file isn't pulled in, the model can't account for it — a real source of subtle bugs (more in section 5).
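The index, search, and context-window steps above can be sketched roughly as follows. This is an illustration under loud assumptions: the snippets and file paths are invented, a word budget stands in for a token limit, and plain word-overlap similarity stands in for the learned embeddings that real semantic search uses (embeddings can match related code even when no words are shared; this toy cannot).

```python
import math
import re
from collections import Counter

# Hypothetical snippets standing in for an indexed repository.
SNIPPETS = {
    "auth/session.py": "def logout_user(user): clear the session token and log the user out",
    "ui/buttons.py": "def render_button(label, on_click): draw a button in the page header",
    "billing/invoice.py": "def total_invoice(items): sum item prices and add tax",
}


def vectorize(text):
    """Bag-of-words vector: a crude stand-in for a learned embedding."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_index(snippets):
    """Indexing step: precompute one vector per snippet."""
    return {path: vectorize(text) for path, text in snippets.items()}


def retrieve(query, index, snippets, budget_words):
    """Rank snippets by similarity, then pack them until the budget runs out."""
    query_vec = vectorize(query)
    ranked = sorted(index, key=lambda path: cosine(query_vec, index[path]), reverse=True)
    chosen, used = [], 0
    for path in ranked:
        cost = len(snippets[path].split())
        if used + cost > budget_words:
            break  # the context window is full
        chosen.append(path)
        used += cost
    return chosen


QUERY = "add a logout button to the header"
INDEX = build_index(SNIPPETS)
ROOMY = retrieve(QUERY, INDEX, SNIPPETS, budget_words=25)
TIGHT = retrieve(QUERY, INDEX, SNIPPETS, budget_words=12)
print(ROOMY)  # ['ui/buttons.py', 'auth/session.py']
print(TIGHT)  # ['ui/buttons.py']
```

With the tight budget, the session code never reaches the model, so any suggestion would be written without knowing how logout actually works in this project.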
04 The agentic coding loop
An autonomous coding agent doesn't write everything in one shot and hope. It runs a loop, much like a human engineer: read the task, plan, edit files, run the tests, and, if a test fails, read the error, fix the code, and run the tests again. It repeats until everything passes, then hands you a set of changes to review.
- The loop is read → plan → edit → test → fix → repeat, ending in a reviewable set of changes.
- A failing test is feedback: the agent reads the error, fixes the code, and runs the test again rather than giving up.
- It ends by handing you a diff to approve — the human checkpoint before anything ships.
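The shape of that loop can be sketched in a few lines. Everything here is a simplified assumption rather than how any specific tool works: `propose_code` is a hypothetical stub that returns canned attempts where a real agent would call a language model, the `slugify` task is invented, and a retry cap is added so the sketch always terminates.

```python
# Canned attempts standing in for model output (hypothetical; a real agent
# would call a language model here and edit real files).
ATTEMPTS = [
    "def slugify(title):\n    return title.replace(' ', '-')\n",          # forgets to lowercase
    "def slugify(title):\n    return title.lower().replace(' ', '-')\n",  # fixed
]


def propose_code(task, feedback, attempt_number):
    """Stub for the model call: returns the next canned attempt."""
    return ATTEMPTS[min(attempt_number, len(ATTEMPTS) - 1)]


def run_tests(code):
    """Run the candidate; return None on success, else an error message."""
    namespace = {}
    exec(code, namespace)  # fine for this fixed demo; never exec untrusted code unsandboxed
    result = namespace["slugify"]("Hello World")
    if result != "hello-world":
        return f"expected 'hello-world', got {result!r}"
    return None


def agent_loop(task, max_attempts=3):
    """read -> plan -> edit -> test -> fix -> repeat, then hand back changes for review."""
    feedback, history, code = None, [], ""
    for attempt in range(max_attempts):
        code = propose_code(task, feedback, attempt)  # edit
        feedback = run_tests(code)                    # test
        history.append("pass" if feedback is None else "fail")
        if feedback is None:
            return {"status": "ready_for_review", "code": code, "history": history}
    return {"status": "gave_up", "code": code, "history": history}


RESULT = agent_loop("add a slugify helper")
print(RESULT["status"], RESULT["history"])  # ready_for_review ['fail', 'pass']
```

Note that the loop ends at "ready_for_review", not "shipped": passing tests only earns the change a human look.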
05 Real value, real limits, and why you still review
AI coding tools genuinely speed up real work, and benchmarks track how much better they keep getting: HumanEval asks whether a model can write a correct standalone function, and SWE-bench asks whether it can fix a real GitHub issue so the tests pass. But a high benchmark score is a general gauge, not a promise about your code. The limits are real. Because the model predicts likely text, it can be confidently wrong: calling an API that doesn't exist, or importing a package that was never published. That last failure enables an attack with its own name. When a model keeps hallucinating the same fake package, an attacker can register that exact name and fill it with malicious code, so accepting the AI's suggestion installs the attack. The community calls it "slopsquatting." Add finite context windows and the temptation to over-trust, and the takeaway is simple: treat AI-generated code as an untrusted draft until you've reviewed and tested it.
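One small, automatable piece of that review is checking which packages a suggestion imports before anything gets installed. The sketch below assumes a hypothetical team allowlist (`APPROVED_PACKAGES`) and uses a made-up package name, `fastjsonx`, as the hallucinated import. It inspects import names only; an import name does not always match the name you install (for example, `yaml` is installed as PyYAML), so treat this as a first filter rather than a complete defense.

```python
import ast

# Hypothetical allowlist: packages your team has already vetted.
APPROVED_PACKAGES = {"json", "math", "requests"}


def imported_packages(source):
    """Return the top-level package names imported by a piece of Python source."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names.add(node.module.split(".")[0])  # skip relative imports
    return names


def unvetted_imports(source, approved=APPROVED_PACKAGES):
    """List imported packages that are not on the allowlist."""
    return sorted(imported_packages(source) - approved)


# AI-generated draft; "fastjsonx" is a made-up name for this demo.
GENERATED = """
import json
import requests
from fastjsonx.parse import loads
"""

FLAGGED = unvetted_imports(GENERATED)
print(FLAGGED)  # ['fastjsonx']
```

Anything flagged gets looked up by a person before it is installed, which is exactly the step a slopsquatting attack counts on being skipped.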
07 Take it with you & go deeper
- What are AI agents? The anatomy and loop behind autonomous agents; coding is one job they do.
- The agentic AI loop. A deeper look at perceive → reason → act → observe.
- Why AI hallucinates. The root cause behind made-up APIs and packages.
- Tool misuse & excessive agency. The security deep dive behind the review wizard.
- The agentic coding loop, explained (in the pipeline). A worked read → plan → edit → test → fix example, and where it breaks.
- Reviewing AI-generated code (in the pipeline). Slopsquatting, slop code, and a defensive review checklist.
Published by Tech Jacks Solutions · Reviewed June 2026. This lesson explains established concepts and is grounded in the references below; figures shown in the interactives are illustrative and labelled as such.
- GitHub Copilot documentation — GitHub
- Claude Code documentation — Anthropic
- Evaluating Large Language Models Trained on Code (Codex / HumanEval) — Chen et al. (2021)
- SWE-bench: Can Language Models Resolve Real-World GitHub Issues? — Jimenez et al. (2023)
- OWASP Top 10 for Large Language Model Applications — OWASP
AI Coding Assistants & Agents — in 5 minutes
Tech Jacks Solutions · AI Knowledge Hub · educational summary
Autocomplete → assistant → agent
An autonomy slider. Autocomplete finishes your line. A chat assistant answers questions and makes targeted edits. An autonomous agent takes a goal and edits files + runs commands on its own.
How completion works
A large language model trained on code predicts the next token (character/symbol/word) from what you've written, using the surrounding file for context. "Most likely" is not "correct" — it's a prediction.
Finding the right code
The tool indexes your repository and runs semantic (meaning-based) search to pull related code into the model's finite context window, grounding suggestions in your real APIs and style.
The agentic coding loop
Read the task → plan → edit files → run tests → if a test fails, read the error, fix, and re-run → repeat until green → present a diff for human review.
Real limits & reviewing AI code
Models can be confidently wrong: non-existent APIs, hallucinated packages (the "slopsquatting" supply-chain risk), finite context, and over-trust. Treat AI code as an untrusted draft: read the diff, run tests, verify every package/API, sandbox before production, and require human sign-off on sensitive changes. Benchmarks (HumanEval, SWE-bench) gauge general ability, not correctness on your codebase.