Why is power the limit on AI scaling?
For sixty years, compute scaled by shrinking transistors. The next decade scales by adding gigawatts. Different physics, different lead times, different bottleneck.
The binding constraint on frontier AI in 2026 is not transistors and not even chip supply. It is the kilometres of high-voltage wire and the substations that connect a 5 GW load to the grid. That part of the stack moves on a 5–15 year clock.
The bottleneck has moved
For most of the modern compute era, the binding constraint on AI was inside the chip. More transistors, smaller nodes, faster clocks. Engineers spent careers fighting heat, leakage, and lithography limits one die at a time. Power was a budget you wrote on a spreadsheet.
That changed around 2023. Frontier training runs jumped from tens of megawatts to hundreds. Hyperscalers stopped optimising for cost-per-FLOP and started bidding for entire substations. By 2026, every serious AI buildout is led by a power and real-estate team, not a chip team. The chip is something you order — assuming you can plug it in.
The reason the bottleneck moved is asymmetric improvement rates. Chips roughly double in performance per watt every 18–24 months, but absolute power per chip still rises each generation. Cluster sizes — the number of chips you wire into one training run — are growing 4–10× per generation. The product of per-chip power and chip count is the cluster's total draw, and it is now growing faster than the rate at which any electric grid in the world has ever expanded.
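That product can be put on paper. The sketch below uses illustrative figures, not vendor specs: the chip counts, per-chip wattages, and the 1.3× cooling/networking overhead are all assumptions chosen to match the trend described above.

```python
# Back-of-envelope sketch of cluster power growth across generations.
# All numbers are illustrative assumptions, not measured specs.

def cluster_power_mw(chips: int, watts_per_chip: float,
                     overhead: float = 1.3) -> float:
    """Total facility draw in MW. `overhead` approximates cooling and
    networking losses on top of raw chip power (assumed ~1.3)."""
    return chips * watts_per_chip * overhead / 1e6

# Hypothetical generations: chip count grows ~6x, per-chip power ~1.5x.
generations = [
    ("gen N",   25_000,   700),   # ~H100-class per-chip draw
    ("gen N+1", 150_000, 1_200),  # ~B200-class per-chip draw
    ("gen N+2", 900_000, 1_800),  # assumed continuation of the trend
]

for name, chips, watts in generations:
    print(f"{name}: {cluster_power_mw(chips, watts):,.0f} MW")
```

Per-chip efficiency improves every generation, yet total draw still grows roughly an order of magnitude per step, because chip count grows faster than efficiency does.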
What changed in the last four years
In 2022, AI compute was a rounding error on the global grid. By 2026, AI-dedicated compute consumes roughly 3% of US electricity and is doubling every two years. Microsoft, Meta, Google, and Amazon are collectively committing more than $300B in 2026 capex, and the line item growing fastest is power and real estate.
Why “just buy more chips” doesn’t solve it
The natural intuition — if there is a power problem, buy denser chips and use fewer of them — runs into two compounding facts. First, each generation of frontier chip uses more power than the last, not less. An H100 draws 700W. A B200 draws 1,200W. The next generation draws more still. Performance per watt is improving, but absolute power per chip is rising.
Second, training-run efficiency falls as the cluster grows. Doubling your chip count does not double your usable compute; communication overhead, failures, and synchronisation losses eat 10–30% of it. So the path to a more capable model is more chips, drawn at higher power, in a tighter physical footprint, with a more demanding cooling and networking spec. Every variable points at the wall.
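The scaling loss can be made concrete with a toy model. The 0.85 retention-per-doubling factor below is an assumed illustration of the 10–30% loss range quoted above, not a measured figure, and the per-chip FLOP rate is hypothetical.

```python
# Toy model of scaling losses: each doubling of the cluster is assumed
# to retain only ~85% of its linear scaling (an illustrative figure).
import math

def usable_compute(chips: int, flops_per_chip: float,
                   retention_per_doubling: float = 0.85) -> float:
    """Effective FLOP/s after communication, failure, and
    synchronisation overheads."""
    doublings = math.log2(chips)
    return chips * flops_per_chip * retention_per_doubling ** doublings

base = usable_compute(16_384, 1e15)     # ~1 PFLOP/s-class chips
doubled = usable_compute(32_768, 1e15)
print(f"2x the chips -> {doubled / base:.2f}x the usable compute")
```

Under this assumption, doubling the cluster yields 1.7× the usable compute, so hitting a 2× compute target requires more than 2× the chips — and all of that extra hardware draws power at full rate regardless.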
The vocabulary you need
Power is a rate — energy per unit time — measured in watts. The terms you’ll see in every AI buildout discussion stack like this:
- Watt (W) — a phone charger draws ~5 W. A laptop, ~30 W.
- Kilowatt (kW) — a home, ~1 kW continuous. An EV fast-charger, ~150 kW.
- Megawatt (MW) — a small data center, ~10 MW. A large industrial site, ~100 MW.
- Gigawatt (GW) — a nuclear reactor, ~1 GW. A frontier training cluster in 2026, ~1 GW.
- Terawatt (TW) — total US electric grid, ~0.5 TW. Global grid, ~3.2 TW.
The frontier cluster of 2030 is a 5–10 GW load. That is roughly the entire electric demand of a US state like Connecticut, dropped onto a single site, with the expectation that it will run flat-out 24/7.
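A quick sanity check translates that load into the unit ladder above, using the article's own rough figures:

```python
# Unit-ladder sanity check for a 5 GW frontier cluster,
# using the rough figures from the list above.

W = 1.0
KW, MW, GW, TW = 1e3 * W, 1e6 * W, 1e9 * W, 1e12 * W

cluster = 5 * GW
print(cluster / GW)          # reactors' worth of generation: 5.0
print(cluster / (10 * MW))   # small data centers' worth: 500.0
print(cluster / (0.5 * TW))  # share of the US grid: 0.01, i.e. ~1%

# Run flat-out 24/7, the load becomes a fixed energy bill:
hours_per_year = 8_760
print(cluster * hours_per_year / 1e12)  # ~43.8 TWh per year
```

One site, one percent of the national grid, running at full tilt around the clock — which is why the conversation moves from chips to substations.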
What this implies for AI strategy
If you are betting on AI through company exposure, capex direction, or your own product roadmap, the consequences of the power bottleneck are concrete:
- Site selection drives the roadmap. Whoever controls siteable, interconnectable power for the 2027–2030 vintage of training will have a meaningful capability edge. Not a marginal cost edge — a capability edge, because they can train models others physically cannot.
- Procurement is the new architecture choice. Behind-the-meter generation, on-site fuel cells, and SMR offtake agreements are now strategic decisions on par with chip selection.
- The grid is the slowest variable in the system. Lab capacity, capital, regulatory will, and chip supply can all flex faster than transmission.
- The bottleneck will rotate again. Once power and wires are unblocked — by gas, nuclear, behind-the-meter builds, fast-tracked transmission — the binding constraint will move back to chips, then to model architecture, then to applications. This is the loop. Knowing where it sits today is the first job.