FOMO is why enterprises pay for GPUs they don't use — and why prices keep climbing

VentureBeat AI · Apr 29, 2026, 3:12 PM


Enterprises can't fix their GPU waste problem because the fix makes the problem worse. Releasing idle capacity would improve utilization, but the same shortage driving GPU prices up is exactly why no team will give capacity back. So the fleet sits at roughly 5%, billed by the hour, and the cycle tightens.

That pressure, repeated across thousands of enterprises over the past two years, is the reason most companies are now running their GPU fleets at roughly 5% utilization, according to Cast AI's 2026 State of Kubernetes Optimization Report, which measured actual production clusters rather than surveying them. It's also the reason nobody releases the idle capacity.

Cast AI co-founder and President Laurent Gil has been tracking the dynamic for two years. “Many of the neoclouds are not cloud,” he told VentureBeat. “They are neo-real estate.”

Five percent is about six times worse than a no-effort baseline. Gil puts a reasonable human-managed target at around 30% once you factor in day cycles, weekends and normal business patterns. Five percent means enterprises are running their most expensive infrastructure line at a fraction of what doing nothing intentional would yield.

And it lands at the same moment cloud compute pricing has broken its 20-year pattern. AWS quietly raised its reserved H200 GPU prices by roughly 15% on a Saturday in January, with no formal announcement. Memory suppliers pushed HBM3e prices up 20% for 2026. It is the first time since AWS launched EC2 in 2006 that a hyperscaler has meaningfully raised reserved GPU pricing rather than cut it. For now, the assumption under most enterprise AI budgets, that cloud compute gets cheaper every year, no longer holds at the top of the stack.

The cloud market has split in two

The pricing move matters less for what it is than for what it signals about where the shortage actually bites. Cloud compute has split into two layers. At the commodity layer, the old deflation still works. H100 on-demand pricin

Article preview — originally published by VentureBeat AI. Full story at the source.


Aggregated and edited by the Scoop newsroom. We surface news from VentureBeat AI alongside other reporting so you can compare coverage in one place.
