Avoiding Death on the Yellow Brick Road

a16z.news

The question I keep getting from founders and prospective employees: is there any AI application layer left to build, or are OpenAI and Anthropic going to kill everything?

There’s a particular flavor of AI psychosis behind the question. Some people have concluded that the only durable places to avoid the permanent underclass are inside a big lab or out on the frontier building in robotics, hardtech, or similar: theoretically anything “the labs can’t touch.” If every piece of software is about to be eaten, either by Codex or Claude absorbing the work directly or by a future model that will make whatever you’ve built unnecessary, then run!

Listen, I’m as much of an AI maximalist as almost anyone, and I think they’re half right. The labs really are coming for a huge swath of the application surface. But “the application layer” isn’t one homogeneous opportunity. The right framing is whether you’re on the Yellow Brick Road or somewhere else in Oz.

The Yellow Brick Road is our shorthand for the path the labs are walking, where they’re committing extraordinary resources. The labs are best suited to problems like code generation, writing, and image creation because these problems improve with raw model capability: every dollar spent on pre-training and post-training improves product quality. Meanwhile, the rest of Oz is inhabited by more complex, often vertical problems that aren’t as simple as giving a business user a horizontal product with access to standard tools and computer use. The value comes less from the underlying model’s raw capability (though that’s still important!) than from the scaffolding around it that makes the output trustworthy, compliant, and operational inside a specific industry.

We’re seeing this play out in real time: OpenAI and Anthropic are effectively telling the market they can’t solve every problem with a generic AI coworker. They’ve announced massive forward-deployed joint ventures to build whole companies around configuring and customizing their models for the enterprise. You don’t pour billions into those programs if you think the next model release is going to take care of it.

So if you want to get rich building AI apps, avoid the Yellow Brick Road and build somewhere else in Oz.

Here’s what we’ve learned, and what some of our portfolio founders have learned, about what works.

If you’re starting a company, the Yellow Brick Road is the most obvious path to go down, but it’s also the most dangerous. Take a high-performing model, plug in some off-the-shelf connectors (like Google Drive, Slack, Salesforce, Notion, GitHub), and ship some sort of agentic orchestration layer on top of that. Magic!

The problem is that this is exactly what the labs are doing with Cowork and Codex. Obviously, they own the model, which gives them better margins, control, and the ability to exert pricing power on anyone downstream of them. But maybe most importantly, they also own the architectural choices that define what their products are built to solve well. They’ve been deliberate so far about the model-plus-tool-calls pattern, which is exactly what horizontal, low-step-count work on the road requires. Even if a startup could somehow outperform Codex or Claude Code, the labs have massive distribution arms and the biggest brand halo in AI.

If you’re an AI app company running that playbook with the same connectors, no sub-agents or configuration below it, and no distribution, you’re likely walking down the road to nowhere.

It’s not all doom and gloom for startups. There’s an enormous opportunity off the Yellow Brick Road, where startups have a clear path to own their customer and solve complex problems.

These businesses are building agentic experiences where the model is woven through a complex web of tools, automations, and integrations (read: software), which makes most of these startups vertical by default. They can focus on multi-step and multi-player work, with sub-agents for role- and vertical-specific tasks, that Anthropic and OpenAI can’t reach with horizontal platforms: gathering context across systems, then routing through multiple humans who have to approve at different stages. It often involves one or more legacy systems, tends to need deterministic outcomes where ambiguity isn’t acceptable, and is at times tied to some valuable business outcome. The labs understand how valuable these problems are: that’s why they’re building their own outsourced configuration shops, and why an entire upmarket class of reinforcement learning businesses exists.

The obvious response is that, to date, it’s been a pretty bad trade to bet against the models and the labs improving.

They’ll likely just keep getting better and eventually eat into the market served by these application-layer businesses. The labs will certainly improve, but I’d argue there are a few ways the rest of Oz can defend itself over time:

Data and learning flywheels: A lot of what you internalize isn’t in any training set: unwritten industry norms, undocumented standards, the tribal knowledge that lives in practitioners’ heads. None of it is on the public web. No amount of training compute substitutes for being inside the workflows where this knowledge actually lives. There are two flywheels stacked on top of each other here: an across-customer one (patterns that compound as you see more variants of the same problem) and a within-customer one (the why behind specific decisions, the unsaid exceptions, the firm’s own rules of thumb that only surface through real interaction with the system).

Even if customer data can’t be used across customers, application companies will be able to leverage pattern recognition across customer problem types, and use that to inform the right architecture for future problems.

A company that has run its agents through a hundred legal redlines, a thousand insurance underwriting cycles, or ten thousand SDR campaigns has internalized the shape of the problem in a way the next entrant cannot replicate by spinning up a fresh agent for the first time.

A horizontal agent could in principle build the same learning infrastructure. The reason it doesn’t, beyond pure focus, is UX: capturing this kind of knowledge depends entirely on the workflow surfaces you give the user, and vertical players can shape those surfaces around exactly what their workflow needs to capture. Horizontal tools can’t. Eval sets, labeled outputs, and edge-case taxonomies can compound into a vertical-specific data flywheel, which can fuel fine-tuning that the next entrant can’t match without comparable production exposure. Whether fine-tuning on that data is possible depends on data rights, the volume of production exposure accumulated, and the structure of customer contracts, but pattern recognition accrues regardless.

Managing model variability and complexity: The labs are already routing internally: different model classes for different requests, ensembles under the hood. What they can’t do is route across vendors, evaluate a competitor’s model for a specific sub-task, or use an open-source fine-tune for the narrow piece where it’s actually best. The Rest of Oz company picks the right model for each sub-task across the entire model market, not just what one lab ships. It also does the work nobody wants to do every time a new model lands: re-running evals on upgrades, recalibrating prompts for the customer’s edge cases, and rolling out without breaking production. The labs aren’t doing this on the customer’s behalf; they sell you their next model and tell you to migrate. The Rest of Oz company absorbs the migration. What the customer gets is the best intelligence available across the whole market, plus continuity through every upgrade.

Cost optimization: Running every query through Opus 4.7 is the fastest path to negative gross margins. The best Rest of Oz companies route across tiers of models: frontier models for the hardest tasks, mid-tier for the bulk, and smaller custom or fine-tuned models where they’ve earned the right to use them. Some are now post-training their own models on top of that, optimizing them for the narrow slice of work their customer cares about and serving them at a fraction of the cost of a frontier API call.
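
To make the tiered routing described above concrete, here is a rough sketch in Python. It is an illustration under stated assumptions, not how any particular company does it: the tier names, capability scores, prices, and sub-task names are all made up, and a real router would derive the required capability per sub-task from its own evals rather than from a hard-coded table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelTier:
    name: str                  # hypothetical label, not a real model ID
    capability: int            # higher means more capable (illustrative scale)
    cost_per_1k_tokens: float  # illustrative price in dollars


# Illustrative tiers; names, scores, and prices are invented for this sketch.
TIERS = [
    ModelTier("small-fine-tuned", capability=1, cost_per_1k_tokens=0.001),
    ModelTier("mid-tier", capability=2, cost_per_1k_tokens=0.01),
    ModelTier("frontier", capability=3, cost_per_1k_tokens=0.10),
]

# Hypothetical mapping from sub-task to the minimum capability it needs.
REQUIRED_CAPABILITY = {
    "extract_fields": 1,
    "summarize_policy": 2,
    "draft_redline": 3,
}


def route(sub_task: str) -> ModelTier:
    """Return the cheapest tier whose capability meets the sub-task's need.

    Unknown sub-tasks fall back to the most capable tier.
    """
    top = max(t.capability for t in TIERS)
    needed = REQUIRED_CAPABILITY.get(sub_task, top)
    eligible = [t for t in TIERS if t.capability >= needed]
    return min(eligible, key=lambda t: t.cost_per_1k_tokens)


def estimate_cost(workload: dict[str, int]) -> float:
    """Estimate dollar cost for a workload given as {sub_task: token_count}."""
    return sum(
        route(task).cost_per_1k_tokens * tokens / 1000
        for task, tokens in workload.items()
    )
```

With these invented numbers, a workload that is mostly field extraction with an occasional redline costs far less than sending everything to the top tier, which is the margin argument in miniature. The hard part in practice is the `REQUIRED_CAPABILITY` table: knowing what each sub-task actually needs.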

The labs price the floor: the least intelligence available at $X. The Rest of Oz company sells the inverse: the lowest dollar cost for the specific level of intelligence the workflow actually requires. That’s only possible if you know exactly what level each sub-task needs, which the labs structurally can’t know across every vertical. It translates directly into lower, controlled prices for outcomes.

Governance: There is considerable value in becoming the control plane for how customers run AI in a given vertical: the place where permissions, auditing, what-the-agent-is-allowed-to-do, and what-the-agent-actually-did all converge. That control plane is built out of use-case-specific guardrails that look completely different across industries and job types. Because Rest of Oz companies own the tools, the workflows, and the data the agent touches end-to-end, they can provide deterministic outcomes in ways horizontal tools will struggle to. They are also the entity that absorbs the regulatory complexity for the end buyer: FRCP and bar rules in legal, HIPAA in healthcare, SEC and FINRA in finance, state insurance regulations, and so on. A horizontal player can’t credibly do that without becoming a hundred different verticals at once. CIOs want a partner that contractually commits to handling compliance for the agents it provides.

All of these come back to the same thing: focus. That could be a vertical (insurance, legal, accounting) or a function done deeply (sales, customer support, finance). Either way, the work needs a team that’s heads-down on one customer set: its workflows, its edge cases, its regulations.