“Our IQs are so low that we’re actually using [AI tools] to check out the recipe for, you know, French toast. That’s where you’re seeing the prices go up.”
John Zito, co-president of Apollo Asset Management, sat for a fireside chat at the Morgan Stanley US Financials Conference on Wednesday, where he suggested to Bloomberg that, measured per unit of intelligence delivered rather than per token, prices are collapsing – even as low-value usage drives the actual bills up.
“I think tokenmaxxing and token talk is – it’s a lot of BS, honestly. Like, if you look at per unit of knowledge and cost per unit of knowledge, prices are collapsing. Prices are collapsing per unit of IQ, if you did it that way.”
In other words: a token is not a unit of intelligence. Price the capability instead of the throughput – the way a 2026 laptop costs what a 2010 laptop did but is 50x more capable – and the cost of “IQ” is in freefall even while the bills explode.
He also suggested that we’re screwed if AI isn’t just hype:
“If AI is real, it’s so hyper-deflationary to so many things over the long term that it’s really hard to take risk.”
So a few things are going on here. The spending problem is real. The metric everyone is using to describe it is wrong. And the resolution of that tension is where the entire AI trade goes next.
As Goldman’s Rich Privorotsky noted five days earlier, consensus was already migrating to exactly this frame: that the relevant economic metric is not token volume but useful task completion per watt and per dollar; that customers facing usage-based pricing “will optimize for cost per completed task,” routing simple work to local models, harder tasks to the cloud, and frontier models only when required; and that we “maybe have allocated too much spend to the Data-Centric model.” When Apollo and Goldman independently land on the same framework inside a week, we’re looking at a new institutional consensus forming in real time.
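For readers who want to see what “optimize for cost per completed task” looks like mechanically, here is a minimal sketch. Everything in it is an assumption for illustration: the tier names, the 1-10 difficulty scale, and the per-task costs are hypothetical and are not figures from Goldman, Apollo, or anyone else quoted here. It simply sends each task to the cheapest tier that can complete it and compares the bill with pointing the frontier model at everything.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    max_difficulty: int   # hardest task the tier can complete (hypothetical 1-10 scale)
    cost_per_task: float  # hypothetical dollars per completed task


# Illustrative numbers only; none of these come from the article.
TIERS = [
    Tier("local", max_difficulty=3, cost_per_task=0.001),
    Tier("cloud", max_difficulty=7, cost_per_task=0.05),
    Tier("frontier", max_difficulty=10, cost_per_task=1.00),
]


def route(difficulty: int) -> Tier:
    """Return the cheapest tier able to complete a task of this difficulty."""
    capable = [t for t in TIERS if t.max_difficulty >= difficulty]
    if not capable:
        raise ValueError(f"no tier can handle difficulty {difficulty}")
    return min(capable, key=lambda t: t.cost_per_task)


# A French-toast-heavy workload: mostly trivial, one genuinely hard task.
workload = [1, 1, 2, 5, 9]

routed_cost = sum(route(d).cost_per_task for d in workload)
frontier = max(TIERS, key=lambda t: t.max_difficulty)
all_frontier_cost = frontier.cost_per_task * len(workload)

print(f"routed:       ${routed_cost:.3f}")
print(f"all frontier: ${all_frontier_cost:.3f}")
```

Under these made-up numbers, nearly all of the routed bill comes from the single hard task, which is the shape of the argument: the frontier gets paid for frontier work, and everything else trades down.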
The French toast economy
Zito’s diagnosis of why enterprise AI bills exploded will sound familiar to ZeroHedge premium subscribers: too many companies pointing frontier models at tasks that don’t remotely justify the compute.
“Our IQs are so low that we’re actually using [AI tools] to check out the recipe for, you know, French toast. That’s where you’re seeing the prices go up.”
Swap “French toast” for “checking the weather” and that is, almost verbatim, the tokenmaxxing reductio we documented at Amazon – employees routing busywork through agents to climb the KiroRank leaderboard, frontier reasoning models deployed against questions a search bar answered in 2009. Zito even joked that his own IQ is “not high enough” to need Anthropic’s next flagship model, and that only “a handful” of users genuinely need, and can monetize, the bleeding edge.
So the mismatch between task and tool doesn’t persist forever. It gets arbitraged into what he called a new economy for the sector: “The AMD chip, the Nvidia chip, all these different chips will be used and optimized for a certain use-case to solve the spend problem.” Citadel and Jane Street pay anything for the frontier because their ROI is, in his word, massive. Everyone else’s French toast queries get routed to something cheap.
And of course we watched this unfold over the last month. Bloomberg notes Uber set usage limits on tools like Claude Code after incinerating its AI budget, and Walmart capped an in-house AI agent – the one that helps employees with spreadsheets and presentations – after demand ran too hot. The caps are landing on exactly the low-IQ-task tier Zito is describing, while the frontier spend stays untouched.
How we got here
For readers just joining: this is the latest beat in a story that has moved very fast.
Last month we noted that the AI narrative had hit a serious snag, after Uber’s COO Andrew Macdonald admitted the company couldn’t draw a line between exploding token consumption and useful product output. This, after 5,000 engineers burned the entire 2026 AI budget by April. Data spanning 2,444 companies suggested only 18 cents of every AI dollar reaches users as stable product, with 44 cents going to fixing bugs the AI itself introduced.

Then came the $500 million mystery bill – an unnamed enterprise client, per Axios, torching half a billion dollars on Claude in a single month with no usage caps – landing the same week Amazon nuked its internal leaderboard and an SVP begged staff not to use AI for the sake of using AI.
Then, in Part II of our reporting, ‘From Singularity To Tokenomics,’ we noted that the subsidy had formally ended: GitHub flipped Copilot to usage-based billing on June 1 – the same morning Anthropic confidentially filed its draft S-1 – and developers hit their monthly quotas before lunch. OpenAI, Google, and Microsoft all executed the same flat-rate-to-meter pivot within sixty days of each other. Sam Altman conceded that cost went from a non-issue in January to, in his words, a huge issue and a meme.
By Monday, Goldman’s one-delta desk was flagging that the Silicon Data Token Spending Index had started to soften – Q1 may have been peak token-maxxing-as-KPI – and Citrini Research had coined the inevitable sequel: in a matter of weeks, the narrative went from tokenmaxxing to tokenpanic.
We’re happy everyone is now looking at this chart… You’re welcome?

What enterprise customers are actually saying
Fresh comments from UBS paint an interesting picture. After polling actual IT execs at enterprise AI customers, the bank reports that token costs have become a real issue for roughly 60% of the enterprises they spoke to – “this is not a made-up media story,” in the bank’s own words. One customer described the GitHub Copilot pricing change in a single word: “chaos.” Another got their first AI bill and heard leadership say, flatly, “we don’t have the money for this.” A third admitted: “we overbuilt in certain areas and are starting to feel the wrath.”

But bears should pay attention to this part: not a single customer in the checks was slamming on the AI brakes. UBS found the dominant behavior is guardrails, not retreat: caps, alerts, model-downshifting, pooled tokens – normal enterprise cost-containment. Several customers explicitly refused to throttle usage (“we don’t want to throttle them… our aim is to just get our employees to start using AI”) and are instead cannibalizing other IT spend – cutting external IT services, consolidating cloud, and, notably, metering headcount growth – to make room for the AI line item. Even Uber, the poster child for budget incineration, has set per-engineer token caps around $1,500 a month – which, as UBS dryly notes, is still extremely high – and its CEO describes the company as full steam ahead.
So according to UBS, costs have been spiking because adoption is ramping, not because per-unit prices are inflating – the per-unit cost of intelligence is falling. Which is exactly Zito’s “cost per unit of IQ” point.
Both things are true
So is tokenmaxxing “a lot of BS”? Zito is right about the denominator. Cost per unit of intelligence is collapsing, relentlessly. Open-source and Chinese models deliver near-frontier capability at 10-25x lower cost; Cursor’s new model matches frontier coding performance at a tenth the price per task. Measured per unit of IQ, this is the most deflationary technology in living memory.

The CFOs are right about the numerator. Gartner has found that even a 90% collapse in inference costs won’t make enterprise AI cheaper, because agents devour tokens faster than prices drop and providers don’t fully pass the savings through. It is also the lived experience of every company in the UBS checks. A collapsing unit cost times an exploding unit count is still a bigger bill.
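The arithmetic behind that last sentence fits in a few lines. This is a sketch under stated assumptions: the starting price, the token volumes, and the 25x volume growth are hypothetical placeholders, not Gartner or UBS data. Only the 90% price drop echoes the figure cited above.

```python
def monthly_bill(price_per_million_tokens: float, tokens_millions: float) -> float:
    """Spend is simply unit price times unit count."""
    return price_per_million_tokens * tokens_millions


def breakeven_volume_multiple(price_drop: float) -> float:
    """Volume growth multiple at which the bill is unchanged after a fractional price drop."""
    if not 0 <= price_drop < 1:
        raise ValueError("price_drop must be in [0, 1)")
    return 1.0 / (1.0 - price_drop)


# Hypothetical starting point: $10 per million tokens, 1,000 million tokens a month.
before = monthly_bill(price_per_million_tokens=10.0, tokens_millions=1_000)

# Price collapses 90% (the figure cited above); volume grows 25x (our assumption).
after = monthly_bill(price_per_million_tokens=1.0, tokens_millions=25_000)

print(f"bill before: ${before:,.0f}")
print(f"bill after:  ${after:,.0f}")
print(f"volume multiple needed just to break even: {breakeven_volume_multiple(0.90):.1f}x")
```

A 90% price cut is fully offset once volume grows about tenfold, so any agent workload that multiplies token consumption by more than that leaves the CFO with a larger bill than before the cut.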
Token expenditure is a meaningless productivity metric and a decisive revenue metric. Nobody underwriting a near-trillion-dollar AI IPO can dismiss it as a measure of revenue durability. That is why the rollover in the Silicon Data index is worth watching: the chart measures nothing about value created, and everything about the thing the valuations are built on.
The same logic applies to the narrative itself. Zito calls the token talk noise; that noise doubled the market value of the semiconductor industry in two months on the way up and is unwinding it now. A fundamental investor can dismiss what a narrative measures. A trader cannot dismiss what it moves.

The unresolved question for traders: The infrastructure complex is priced for token demand going up and to the right at the frontier – Goldman’s 24x by 2030. But if Zito’s use-case economy arrives, a large share of that volume migrates to commodity inference: cheap chips, open models, local hardware. Volume can keep growing while the dollars – and the margins – pool somewhere other than where today’s valuations assume. The UBS checks already show the mechanism in motion: enterprises aren’t cutting AI, they’re cutting around AI, and routing down-market wherever good enough will do.
Meanwhile, the token expenditure index printed its sixth straight down day – the longest streak since January – with Citadel’s read attributing the drop to adoption becoming “less about what frontier models can do and more about the price,” a shift toward cheaper models. Note what that means for the chart: six red days on an expenditure index isn’t necessarily usage falling – it may be the deflation itself arriving in the spend line, the same work bought cheaper. The numerator and the denominator, colliding in one print. And savor the garnish: Citadel is one of the two firms Zito named as gladly paying anything for the frontier – and it’s their desk narrating everyone else trading down.
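To make the “same work bought cheaper” reading concrete, the sketch below backs out the volume change implied by a given move in spend and a given move in blended price. The inputs are purely illustrative assumptions; the article gives neither the size of the index decline nor any blended price series, and the function name is our own.

```python
def implied_volume_change(spend_change: float, price_change: float) -> float:
    """Back out the fractional change in volume from fractional changes in spend and price.

    Uses spend = price * volume, so
    (1 + spend_change) = (1 + price_change) * (1 + volume_change).
    """
    if price_change <= -1:
        raise ValueError("price_change must be greater than -100%")
    return (1 + spend_change) / (1 + price_change) - 1


# Hypothetical: spend index down 5% while the blended price paid falls 20%.
volume = implied_volume_change(spend_change=-0.05, price_change=-0.20)
print(f"implied volume change: {volume:+.2%}")

# Hypothetical: spend and price fall by the same 20%, so usage is flat.
flat = implied_volume_change(spend_change=-0.20, price_change=-0.20)
print(f"implied volume change: {flat:+.2%}")
```

In the first case usage is up nearly a fifth even though the spend line is red, which is why a falling expenditure index alone cannot tell you whether demand is weakening or customers are simply trading down.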
Volume and value have decoupled. The desks have noticed. The repricing is the part that comes next.
Source: Zero Hedge
