Three buzzwords that are really one story
If you've followed AI for the last couple of years, you've watched the vocabulary shift under your feet. First everyone was a prompt engineer. Then prompt engineering was supposedly dead and it was all about context. Now the smart money talks about building the harness. Each turn arrives with a confident announcement that the last thing is over.
It isn't. These aren't three fashions replacing each other. They're three layers of the same thing, and each new one was built on top of the one before. The story of getting good results from AI is the story of the question slowly getting bigger: from how do I word this? to what does the model need to see? to what whole system should it work inside? Once you see that, the buzzword churn stops being confusing and starts being a map.
The confusion is understandable. Every time the field finds a new lever to pull, someone declares the old lever obsolete, because the new one feels so much more powerful. But "more powerful" and "replaces" are different claims. A car engine is more powerful than a bicycle's chain, yet the engine still turns gears and gears still turn a chain of cause and effect. The new layer didn't throw out the old one. It wrapped it.
Wording, briefing, workshop
Picture asking a sharp new colleague to do something.
The first thing you control is how you phrase the request. "Make it better" sends them in a hundred directions; "shorten the intro to two sentences and keep the examples" sends them in one. That's prompt engineering — getting the wording of the ask itself clear, complete, and unambiguous. It was the first thing people discovered, because it's the most visible. You type words; you get words back; obviously the words you type matter.
But a request never lands in a vacuum. Your colleague also needs the relevant document, last week's decision, the example of what "good" looks like. Hand them the request with the wrong file open and even a perfectly worded ask goes sideways. Curating all of that — the files, the prior conversation, the examples, the facts pulled in from elsewhere — is context engineering. It's not about the sentence; it's about the whole briefing folder the sentence sits inside.
And even a well-briefed colleague needs a place to work: tools to reach for, the habit of checking their work, someone to catch a mistake before it ships. Building that whole environment around the model — the tools it can call, the way it reads your files, the retries when a step fails, the quality checks at the end — is harness engineering. The model is the brain; the harness is the workshop the brain works in.
Here's the part the announcements miss. These don't sit side by side as alternatives. They nest. The prompt lives inside the context, and the context lives inside the harness.

A harness, when it runs, assembles a context. That context, at its heart, contains a prompt. You can't have the outer layer without the inner ones still doing their job. Which is exactly why "prompt engineering is dead" was always wrong: it didn't die, it got absorbed. It became the innermost layer of something larger.
The nesting matters in a practical way, not just a tidy one. A weak inner layer quietly caps how good the outer ones can be. The most sophisticated harness in the world still feeds the model a context, and if the request buried inside that context is mushy, no amount of clever tool-calling rescues it — the system just executes your confusion very efficiently. Garbage in, confident garbage out. The layers amplify each other, which means they also amplify each other's flaws. That's why the people building the slickest harnesses are often the most fanatical about the quality of the prompt sitting at the core of them.
How each wave grew out of the last
The reason this looks like a sequence of replacements is that the layers showed up one at a time, each solving the problem the previous one exposed.
In the early days you mostly typed into a single box, so the wording was everything. People discovered that small changes in phrasing produced wildly different output, and a whole craft of prompt engineering sprang up around it. It worked, until people started asking models to do bigger jobs that depended on real information — your codebase, your docs, a long back-and-forth. Then the bottleneck moved. It no longer mattered how cleverly you worded the question if the model couldn't see the thing the question was about. So attention shifted outward to context: retrieval, memory, picking the right files, showing the right examples. Context engineering was born from the limits of prompting, not as a rejection of it.

Then the same thing happened again. Once models could be handed rich context, people wanted them to act — run code, call tools, fix the failure, try again. A single well-stocked context isn't enough for that, because a real task is a loop: do something, see what happened, adjust, repeat, and don't ship anything broken. Building that loop — the tools, the retries, the sandboxing, the checks at the end — is harness engineering, and it grew out of the limits of context the same way context grew out of the limits of prompting. Three overlapping waves, each one standing on the shoulders of the last.
Notice that each wave didn't just add a new skill, it moved the bottleneck. While wording was the limiting factor, prompting was where all the effort went. Once wording was solved well enough, the thing holding you back became what the model could see, so effort flowed to context. Once context was rich enough, the new ceiling was what the model could do with it, and effort flowed to the harness. Progress in this field looks less like inventing brand-new abilities and more like the constraint quietly relocating, then everyone's attention chasing it to the next layer out. The waves overlap because the inner ones never stopped mattering — they just stopped being the part most worth obsessing over in that moment.
Where this is heading
For a while the three layers were built by different people at different times with different tools. That's ending. The interesting work now treats prompt, context, and harness as one integrated system, designed together. The harness decides what context to assemble; the context shapes how the prompt should read; the prompt's intent informs which tools the harness should reach for. Pull them apart and each one is weaker. The frontier is designing the whole agent as a single thing.

That convergence quietly changes your job. As the machinery gets better at assembling context and running the loop, the two things it can never do for you stand out more sharply. It can't invent what you actually want, and it can't tell you whether the result is genuinely right. So the human role drifts away from fiddling with wording and toward two durable skills: expressing intent clearly enough that the system has something true to build on, and verifying the outcome well enough to trust it. The middle layers automate; the ends — intent and judgment — stay with you.
This is reassuring once it clicks. The fear behind every "X engineering is dead" headline is that the skill you just learned is about to be worthless. But what actually happens is that the mechanical parts get automated and the human parts get more valuable, not less. Nobody pays you to remember the exact phrasing that coaxes a model into behaving — a good system handles that. They pay you to know what good looks like and to notice when the output isn't it. As the tooling absorbs more of the middle, being clear about intent and sharp about judgment becomes the whole game. Those happen to be the two skills that transfer to every new model, every new harness, every next wave nobody has named yet.
Which is why the innermost layer still earns its keep. A clear, complete request is what intent looks like when it enters the system, and it's the one input no harness can supply on your behalf. Crafting it by hand, every time, against a coding agent that deserves a good instruction, is tedious — and it's exactly what AgentForge does for you, turning a rough request into a clear one refined against more than 1,000 real coding cases. The waves keep stacking; the request at the center of them is still worth getting right.