Systemising an agent-agnostic harness

Tom & Jakub · Wed, 18 Feb 2026

What OpenAI's agent experiment teaches us about the real infrastructure problem, and why one repo is just the start.

Last week, OpenAI published a detailed engineering analysis of an internal product built entirely by AI agents: over a million lines of code and 1,500 pull requests, with zero manually typed code, produced by Codex agents over five months.

Birgitta Böckeler at Thoughtworks followed with concise commentary that's worth reading alongside it (shared below).

Both articles circle the same growing problem, and we think it's the most important insight in software engineering right now:

From the agent's point of view, anything it can't access in-context while running effectively doesn't exist.

Everything the OpenAI team built - their knowledge base, their architectural linters, their doc-gardening agents, their version-controlled plans - was engineered to solve a single problem: getting the right context to the agent at the right time. They call it context engineering. We've been calling it the same thing since we started building Ctx|.

But here's what the OpenAI post doesn't say loudly enough: they did all of this by hand, for one product, in one repo, with a small elite team who were also the engineers of the harness itself.

That's not a criticism. It simply highlights the reality that this is a proof of concept, and one that many organisations lack the means to prototype their way through.

Now the real question begins.


What happens when you have 50 repos? Or 500?

Birgitta asks whether harnesses will become the new service templates - standardised starting points teams fork, evolve, and contribute back to. It's a great analogy. But service templates have a well-known problem: they fork immediately and drift permanently. This is no different from the eternal documentation drift and decay issue (which is now solved for APIs — appear.sh).

Each team shapes them to their context and they stop being templates at all.

Now add AI agents to that picture. Agents aren't reading one repo. They're touching multiple services, crossing domain boundaries, encountering years of accumulated decisions that live in Confluence pages, Slack threads, Linear tickets, and the heads of people who left two years ago.

The OpenAI team acknowledged it directly: "That Slack discussion that aligned the team on an architectural pattern? If it isn't discoverable to the agent, it's illegible."

At one repo, you can manually curate what goes in. At org scale, with thousands of agents rather than a handful, manual curation breaks down. You need infrastructure.


Systemising a harness

What OpenAI built is impressive engineering for a single context surface. What enterprises need is a context layer - something that sits behind every agent, across every repo, every tool, every domain - and makes institutional knowledge legible to agents automatically.

That's what we're building with Ctx|.


Where OpenAI manually maintained a knowledge base in a single repo, Ctx| builds a self-learning knowledge graph that ingests across your entire organisation: repos, monorepos, ADRs, dependencies, and the tools your teams use daily, such as Linear, Confluence, Jira, Slack, and Datadog. It learns from agent interactions, promoting patterns that work and flagging drift.
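To make "promoting patterns that work and flagging drift" concrete, here is a minimal sketch of the idea in Python. It is an illustration only, not how Ctx| is implemented: the `Node` and `KnowledgeGraph` classes, the source labels, and the thresholds are all assumptions made up for this example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Node:
    """One piece of institutional knowledge, e.g. an ADR or a Slack decision."""
    node_id: str
    source: str         # hypothetical label: "repo", "confluence", "slack", ...
    text: str
    successes: int = 0  # agent interactions where this context helped
    failures: int = 0   # agent interactions where it misled or was stale

    @property
    def uses(self) -> int:
        return self.successes + self.failures

    @property
    def success_rate(self) -> float:
        return self.successes / self.uses if self.uses else 0.0


class KnowledgeGraph:
    """Toy graph: nodes from many sources, undirected links, feedback counters."""

    def __init__(self) -> None:
        self.nodes = {}
        self.edges = defaultdict(set)

    def ingest(self, node: Node, related_to=()) -> None:
        # Link the new node to already-ingested nodes it relates to.
        for other in related_to:
            if other not in self.nodes:
                raise KeyError(f"unknown node: {other}")
        self.nodes[node.node_id] = node
        for other in related_to:
            self.edges[node.node_id].add(other)
            self.edges[other].add(node.node_id)

    def record_feedback(self, node_id: str, helped: bool) -> None:
        node = self.nodes[node_id]
        if helped:
            node.successes += 1
        else:
            node.failures += 1

    def neighbours(self, node_id: str):
        return sorted(self.edges.get(node_id, set()))

    def promoted(self, min_uses: int = 3, min_rate: float = 0.8):
        # Assumed thresholds: enough evidence and a high success rate.
        return sorted(n.node_id for n in self.nodes.values()
                      if n.uses >= min_uses and n.success_rate >= min_rate)

    def drifting(self, min_uses: int = 3, max_rate: float = 0.5):
        # Assumed thresholds: enough evidence and a low success rate.
        return sorted(n.node_id for n in self.nodes.values()
                      if n.uses >= min_uses and n.success_rate <= max_rate)


graph = KnowledgeGraph()
graph.ingest(Node("adr-12", "repo", "Use event sourcing for ledger writes"))
graph.ingest(Node("slack-88", "slack", "Retry with backoff on ledger timeouts"),
             related_to=("adr-12",))

for _ in range(4):
    graph.record_feedback("adr-12", helped=True)
for helped in (True, False, False):
    graph.record_feedback("slack-88", helped)
```

In this toy run, `adr-12` has helped every time and would be promoted, while `slack-88` has misled agents more often than it helped and would be flagged as drifting. A real system would need far richer signals than a success counter, but the loop is the same: ingest, observe, re-rank.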

Where OpenAI built custom linters and structural tests to enforce architectural constraints, Ctx| brings governance to the instruction hierarchy itself - AGENTS.md, skills, MCPs - versioned in git, reviewed in PRs, with promotion and demotion controls so the right rules reach the right agents at the right time.
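As a rough illustration of what "the right rules reach the right agents" could mean, the sketch below resolves a layered set of rules for an agent working in a given repo. Everything in it is hypothetical: the three scope levels, the `Rule` fields, and the `status` values are assumptions for the example, not Ctx|'s actual data model.

```python
from dataclasses import dataclass

# Hypothetical scope levels, broadest first. Narrower scopes override broader ones.
SCOPE_ORDER = ("org", "domain", "repo")


@dataclass(frozen=True)
class Rule:
    key: str                # what the rule is about, e.g. "logging"
    text: str               # the instruction handed to the agent
    scope: str              # "org", "domain" or "repo"
    target: str             # "*" for org, else a domain or repo name
    status: str = "active"  # assumed lifecycle: "draft", "active", "demoted"


def applies(rule: Rule, domain: str, repo: str) -> bool:
    if rule.scope == "org":
        return True
    if rule.scope == "domain":
        return rule.target == domain
    return rule.scope == "repo" and rule.target == repo


def resolve(rules, domain: str, repo: str) -> dict:
    """Return the effective rule per key for an agent working in `repo`."""
    effective = {}
    for scope in SCOPE_ORDER:
        for rule in rules:
            # Only promoted ("active") rules ever reach an agent.
            if rule.status != "active" or rule.scope != scope:
                continue
            if applies(rule, domain, repo):
                effective[rule.key] = rule  # later (narrower) scopes win
    return effective


rules = [
    Rule("logging", "Use structured JSON logs", "org", "*"),
    Rule("logging", "Use the ledger audit logger", "repo", "payments/ledger-api"),
    Rule("retries", "Retry with exponential backoff", "domain", "payments"),
    Rule("orm", "Prefer raw SQL", "domain", "payments", status="demoted"),
]

ledger_rules = resolve(rules, domain="payments", repo="payments/ledger-api")
indexer_rules = resolve(rules, domain="search", repo="search/indexer")
```

Here the ledger agent sees the repo-specific logging rule and the domain retry rule, the demoted rule reaches nobody, and an agent in an unrelated repo falls back to the org-wide default. Because the rules are plain data, they can live in git and be reviewed in PRs like any other change.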

Where OpenAI connected Codex to its own toolset, Ctx| connects every agent - Cursor, Claude Code, Copilot, custom workspaces - through a single MCP interface. A single entry point to your organisation’s full graph. Agent-agnostic.
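For readers unfamiliar with MCP: clients invoke server tools with JSON-RPC 2.0 messages using the `tools/call` method. The sketch below builds such a request in Python. The tool name `search_context` and its arguments are invented for illustration and are not a published Ctx| API; only the message envelope follows the MCP convention.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests carry a unique id


def build_tool_call(tool_name: str, arguments: dict) -> str:
    """Serialise an MCP-style `tools/call` request as a JSON-RPC 2.0 message."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(request)


# Hypothetical tool and arguments: any MCP-capable agent would send the same shape.
message = build_tool_call(
    "search_context",
    {"query": "retry policy for ledger writes", "repo": "payments/ledger-api"},
)
```

The point of the single interface is that the agent on the other end does not matter: Cursor, Claude Code, or a custom workspace would all send the same kind of message and get context back from the same graph.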

The insight from OpenAI's experiment shows us that engineering is rapidly moving from writing code to designing environments, feedback loops, and control systems. Ctx| is infrastructure for exactly that shift - not for one product team running a five-month experiment, but for organisations deploying hundreds or thousands of agents across their entire estate.


What this changes about the harness conversation

Birgitta raises a question we find fascinating: will we converge on fewer tech stacks, fewer topologies, because they're easier to harness? Probably yes, at the codebase level. But at the organisational level, enterprises don't get to start from an empty git repository. They have decades of decisions, a myriad of stacks, and technical debt that would drown any static analysis tool.

As with Appear, our design principle is to meet organisations where they are. The knowledge graph ingests what exists. The instruction hierarchy layers on top. Agents get progressively better context as the graph learns, not because someone manually curated every AGENTS.md file across 300 repos.

This is also why OpenAI's framing of "repo-local, versioned artifacts" is necessary but not sufficient when we zoom out. For a single team, repo-local is correct. For an enterprise, the context surface extends across domains and tools. The graph is the critical connective tissue.


What we're taking from this moment

The OpenAI and Thoughtworks articles together mark something of a line in the sand. Serious practitioners are now converging on context engineering as the infrastructure problem. The "just write better prompts" or "prompt designer" phase is behind us. The "just maintain AGENTS.md" phase is already showing its limits as we ask more, and expect more, of our agents. Tolerance for agent mistakes is shrinking as the technology moves through its own hype cycle. What comes next is the infrastructure phase, and that's the phase we built Ctx| for.

We're two founders, Jakub and Tom, building this with deep experience in dev tooling and in working with enterprises and startups where the shift is happening in real time. If your team is hitting these problems - or planning for them - we'd love to talk.


Ctx| is being built by Tom & Jakub. It has an open-source core, so you can deploy within your own infrastructure or use our managed hosting.



References

Articles referenced in this post: