Bidding for Brains

2026-05-05

Right now, AI is a bargain. Compared to an employee, it's not even close. A senior-level workflow you'd have paid a six-figure salary for last year now runs at the cost of a month's coffee budget. The math is so lopsided it doesn't feel real yet. Companies that have noticed are already restructuring quietly. The rest are about to find out.

The trajectory is simple, and it doesn't require any of the philosophical debates that dominate AI Twitter. The system doesn't have to be AGI. It doesn't have to be conscious. It doesn't have to dream. It just has to keep getting better, and it is. Every quarter, the floor rises. Every model release, the ceiling moves. Within three to five years, the output quality of an off-the-shelf inference call will exceed the median employee at most desk jobs in most companies. At that point, the question isn't whether to use it. The question is how much to spend on it.

The first wave of layoffs has already started. Midwits and juniors get cut first, because their work is the easiest to automate and the lowest-hanging fruit on the cost curve. What remains is a thin layer of staff-level operators wired up to superintelligence: humans who can direct, judge, and validate the output of an inference cluster running at a thousand tokens per second. The org chart inverts. The pyramid flattens. A few well-positioned humans steering an enormous compute budget will outproduce the entire mid-layer of the old company.

The pricing tiers are already forming. Twenty dollars a month gets you the consumer model. Two hundred gets you the professional tier. Five hundred gets you the model that's twenty percent better on the benchmark that matters in your domain. A thousand gets you maximum context, maximum reasoning, full agentic capability. Five thousand gets you the same thing at twice the speed. Fifteen thousand gets you the mega-tier: ten times faster, full bandwidth, parallel inference, no rate limits. Thirty thousand gets you whatever the lab quietly offers Fortune 500 customers under NDA. Each step up is dramatically more capable, dramatically more expensive, and dramatically more decisive in the market.

This is the part that flips the entire frame. At the high end, the inference cost per month will exceed the human cost per month. Fifteen thousand for the human operator, thirty-five thousand for the inference budget that operator is steering. Twelve thousand for the staff engineer, fifty thousand for the agent fleet they're running. The line item people used to call "headcount" becomes a small, almost rounding-error column on the spreadsheet. The big number sits next to it, labeled "compute" or "inference" or "tokens" or whatever this quarter's accounting fashion calls it. It looks like server cost. It looks like office rent. It is now the dominant cost in the business.
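To make the split concrete, here is a minimal sketch that runs the arithmetic on the two monthly pairings quoted above. The dollar figures come from the paragraph; the scenario labels and the `cost_split` helper are illustrative assumptions, not anyone's real accounting model.

```python
# Monthly figures (USD) taken from the paragraph above.
# The scenario names and this helper are hypothetical, for illustration only.
SCENARIOS = {
    "operator": {"human": 15_000, "inference": 35_000},
    "staff engineer": {"human": 12_000, "inference": 50_000},
}


def cost_split(human: float, inference: float) -> dict:
    """Return the combined monthly spend and each side's share of it."""
    total = human + inference
    return {
        "total": total,
        "human_share": human / total,
        "inference_share": inference / total,
        "inference_per_human_dollar": inference / human,
    }


for name, costs in SCENARIOS.items():
    split = cost_split(costs["human"], costs["inference"])
    print(
        f"{name}: ${split['total']:,.0f}/mo total, "
        f"human {split['human_share']:.0%}, "
        f"inference {split['inference_share']:.0%}"
    )
```

On these numbers the human line is roughly 19 to 30 percent of the combined spend, and every dollar of salary is steering somewhere between two and four dollars of inference.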

And they cannot afford not to pay it. This is the trap. The same way a startup couldn't afford not to hire senior engineers last decade — because their competitor was hiring them, and code quality compounds — a company today cannot afford to skimp on the inference tier. Whoever has more compute moves faster, ships better, iterates harder, and outflanks the rest. The capital allocation question for VCs has narrowed to a single variable: how much inference can this team buy, and how well can they direct it. Everything else is downstream of that number.

The arms race dynamic is unforgiving. How much more capability do you actually need? For a human user, not much: the consumer tier is already shockingly capable. But the question stops being absolute and becomes relative the moment money is on the line. In an F1 race, the question isn't "how much horsepower is enough." The answer is always "just enough to beat the car next to me." Five percent more compute. Five percent better reasoning. Five percent faster iteration. The competitor is running thirty grand a month in inference. So you have to run thirty-two. Then they run thirty-five. Then you run forty. The escalation is automatic. Nobody chooses to bid more. Everyone is forced to.
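The escalation can be sketched as a toy model, assuming two firms take turns and each one outbids the other's latest monthly spend by a fixed margin. The five percent margin echoes the "five percent more" framing above, but the fixed-margin rule, the `escalate` function, and the round count are all assumptions; the jumps quoted in the paragraph are steeper than this.

```python
def escalate(start_bid: float, margin: float, rounds: int) -> list[float]:
    """Return successive monthly bids when two firms alternate,
    each outbidding the other's latest spend by `margin`."""
    bids = [start_bid]
    for _ in range(rounds):
        # The firm that is currently behind bids just enough to pull ahead.
        bids.append(bids[-1] * (1 + margin))
    return bids


# Hypothetical run: the rival opens at thirty thousand a month.
bids = escalate(start_bid=30_000, margin=0.05, rounds=6)
for i, bid in enumerate(bids):
    firm = "rival" if i % 2 == 0 else "you"
    print(f"round {i}: {firm} spends ${bid:,.0f}/mo")
```

Even at a modest five percent step, the opening thirty thousand passes forty thousand within six rounds, and neither side ever made a decision larger than "slightly more than the other one."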

Humans don't disappear. They're still required. But their role narrows to the parts of the stack that can't be rented yet: vision, taste, ideas, relationships, networks, judgment. The human supplies the bet; the inference supplies the execution. This is exactly why corporations hired senior talent for decades: they needed brains to supply skills, to translate ambition into product. The pattern is the same. The medium has changed. Today the brains are still partly human, but most of the skills are about to be on tap, by the gigawatt, billed monthly, with overage pricing.

The corporation becomes, functionally, a wrapper around an inference budget. Humans still exist inside it, but they're now the lowest-cost component. A handful of operators, narrowly chosen for judgment and taste, sit at the top of a stack that's mostly compute. The headcount line on the income statement shrinks. The inference line explodes. HR shrinks. Procurement grows. The CFO becomes, in practice, a token-cost analyst with a fancier title.

This is the new caste system, and it's purely financial. The lab tier, companies that own the compute, sits at the top, because they print intelligence and rent it out. The bandwidth tier, companies that can afford the maximum inference, sits just below, because they consume the most intelligence money can buy. The starvation tier, companies operating at consumer pricing, limps along, doing yesterday's work at yesterday's speed, getting outshipped on every metric that matters. There is no intermediate position. There is no "scrappy underdog with a great team." The team doesn't matter if the team can't afford the tokens.

The implication at the human layer is brutal. The best individual engineer in the world, working with a twenty-dollar consumer subscription, will be out-iterated daily by a competent operator running thirty thousand a month of inference. Talent stops being the binding constraint. Capital becomes the binding constraint. The artist matters less than the budget. The soldier matters less than the ammunition. And the firms with the deepest pockets get to make the most reality.

The old story said headcount was the cost. The new story says inference is the cost, and headcount is the floor — a small, irreducible expense for the few humans needed to point the compute in the right direction. Most of the building, most of the writing, most of the analysis, most of the iteration will be done by tokens. The companies that win are the ones that figure out how to bid most aggressively for those tokens, and how to deploy a small, high-judgment crew of humans on top of them without slowing them down.

Everything is becoming a bidding war for brains. Just not human brains. The brains are rented from a handful of labs, by the hour, by the token, by the gigawatt — and whoever has the wallet to bid the highest gets to ship the future first.
