
Google limits Meta's use of Gemini AI models, exposing the industry's deep compute shortage
The restriction, disclosed by the Financial Times, disrupted internal Meta projects and forced the social media giant to urge staff to use AI tokens more efficiently.
The cap and its immediate fallout
Google told Meta around March that it could not provide the full Gemini AI capacity the social media company wanted to purchase, according to three people familiar with the matter cited by the Financial Times. The shortfall disrupted and delayed some of Meta's internal AI projects, and the restriction remains in place. Other Google Cloud clients have also been affected by the compute constraints, though to a lesser extent. Meta was hit particularly hard because of its exceptionally high demand for Google's models.
Meta's response
As a result of the restrictions and a broader push to streamline AI costs, Meta has encouraged staff to be more efficient with AI tokens, the units that measure model usage. The company had originally leaned on Gemini for safety processes like removing harmful content and tackling scams, but it has been shifting workloads to Muse Spark, a new internal model developed under its Superintelligence Labs division. Those internal moves accelerated after Meta cut 8,000 jobs in May, reassigned 7,000 workers to AI-focused roles, and set capital expenditure guidance of $115 to $135 billion for 2026.
Google's own capacity scramble
Google Cloud revenue exceeded $20 billion in the first quarter, but CEO Sundar Pichai said near-term compute constraints prevented even higher growth and contributed to the unit's backlog nearly doubling quarter on quarter to more than $460 billion. The strain pushed Google to sign a $920-million-a-month deal earlier in June to lease computing capacity from Elon Musk's SpaceX, calling it "bridge capacity" to meet surging demand for Gemini Enterprise.
"Our Cloud revenue would have been higher if we were able to meet the demand."
A wider industry pattern
The episode offers a rare glimpse into infrastructure bottlenecks that even the largest tech companies cannot outspend. Despite tens of billions of dollars flowing into chips, data centres and power, demand for AI inference workloads is growing faster than supply. Google is spending over $180 billion on capital expenditure this year, yet it is still rationing access for a customer as large as Meta while renting GPUs from a rocket company. That combination is the clearest signal yet that AI infrastructure buildouts have not kept pace with consumption.
- Google tells Meta it cannot provide full Gemini capacity, disrupting some internal Meta AI projects
- Google Cloud revenue tops $20bn in Q1; CEO Pichai says compute constraints limited growth
- Meta cuts 8,000 jobs, reassigns 7,000 workers to AI, and launches internal model Muse Spark
- Google signs $920mn/month deal with SpaceX for 110,000 Nvidia GPUs as bridge capacity
What comes next
For Meta, the Gemini cap accelerates a transition it was already pursuing: moving from external frontier models to internal alternatives capable of handling critical workloads at scale. The broader industry faces the same structural tension, with revenue growth and model deployment both limited by physical compute ceilings that even record spending has been unable to lift fast enough.
Chart: Google Cloud revenue ($20bn) and backlog ($460bn)


