Why is power the limit on AI scaling?
For sixty years, compute scaled by shrinking transistors. The next decade scales by adding gigawatts. Different physics, different lead times, different bottleneck.
The binding constraint on frontier AI in 2026 is not transistors and not even chip supply. It is the kilometres of high-voltage wire and the substations that connect a 5 GW load to the grid. That part of the stack moves on a 5–15 year clock.
The bottleneck has moved
For most of the modern compute era, the binding constraint on AI was inside the chip. More transistors, smaller nodes, faster clocks. Engineers spent careers fighting heat, leakage, and lithography limits one die at a time. Power was a budget you wrote on a spreadsheet.
That changed around 2023. Frontier training runs jumped from tens of megawatts to hundreds. Hyperscalers stopped optimising for cost-per-FLOP and started bidding for entire substations. By 2026, every serious AI buildout is led by a power and real-estate team, not a chip team. The chip is something you order — assuming you can plug it in.
The reason the bottleneck moved is asymmetric improvement rates. Chips roughly double in performance per watt every 18–24 months, but absolute power per chip still rises each generation. Cluster sizes — the number of chips you wire into one training run — are growing 4–10× per generation. The product of per-chip power and chip count is the cluster's total draw, and it is now growing faster than the rate at which any electric grid in the world has ever expanded.
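That product can be put on paper. The sketch below uses illustrative figures, not vendor specs: the chip counts, per-chip wattages, and the 1.3× cooling/networking overhead are all assumptions chosen to match the trend described above.

```python
# Back-of-envelope sketch of cluster power growth across generations.
# All numbers are illustrative assumptions, not measured specs.

def cluster_power_mw(chips: int, watts_per_chip: float,
                     overhead: float = 1.3) -> float:
    """Total facility draw in MW. `overhead` approximates cooling and
    networking losses on top of raw chip power (assumed ~1.3)."""
    return chips * watts_per_chip * overhead / 1e6

# Hypothetical generations: chip count grows ~6x, per-chip power ~1.5x.
generations = [
    ("gen N",   25_000,   700),   # ~H100-class per-chip draw
    ("gen N+1", 150_000, 1_200),  # ~B200-class per-chip draw
    ("gen N+2", 900_000, 1_800),  # assumed continuation of the trend
]

for name, chips, watts in generations:
    print(f"{name}: {cluster_power_mw(chips, watts):,.0f} MW")
```

Per-chip efficiency improves every generation, yet total draw still grows roughly an order of magnitude per step, because chip count grows faster than efficiency does.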
What changed in the last four years
In 2022, AI compute was a rounding error on the global grid. By 2026, AI-dedicated compute consumes roughly 3% of US electricity and is doubling every two years. Microsoft, Meta, Google, and Amazon are collectively committing more than $300B in 2026 capex, and the line item growing fastest is power and real estate.
Why “just buy more chips” doesn’t solve it
The natural intuition — if there is a power problem, buy denser chips and use fewer of them — runs into two compounding facts. First, each generation of frontier chip uses more power than the last, not less. An H100 draws 700W. A B200 draws 1,200W. The next generation draws more still. Performance per watt is improving, but absolute power per chip is rising.
Second, training-run efficiency falls as the cluster grows. Doubling your chip count does not double your usable compute; communication overhead, failures, and synchronisation losses eat 10–30% of it. So the path to a more capable model is more chips, drawn at higher power, in a tighter physical footprint, with a more demanding cooling and networking spec. Every variable points at the wall.
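The scaling loss can be made concrete with a toy model. The 0.85 retention-per-doubling factor below is an assumed illustration of the 10–30% loss range quoted above, not a measured figure, and the per-chip FLOP rate is hypothetical.

```python
# Toy model of scaling losses: each doubling of the cluster is assumed
# to retain only ~85% of its linear scaling (an illustrative figure).
import math

def usable_compute(chips: int, flops_per_chip: float,
                   retention_per_doubling: float = 0.85) -> float:
    """Effective FLOP/s after communication, failure, and
    synchronisation overheads."""
    doublings = math.log2(chips)
    return chips * flops_per_chip * retention_per_doubling ** doublings

base = usable_compute(16_384, 1e15)     # ~1 PFLOP/s-class chips
doubled = usable_compute(32_768, 1e15)
print(f"2x the chips -> {doubled / base:.2f}x the usable compute")
```

Under this assumption, doubling the cluster yields 1.7× the usable compute, so hitting a 2× compute target requires more than 2× the chips — and all of that extra hardware draws power at full rate regardless.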
The vocabulary you need
Power is a rate — energy per unit time — measured in watts. The terms you’ll see in every AI buildout discussion stack like this:
- Watt (W) — a phone charger draws ~5 W. A laptop, ~30 W.
- Kilowatt (kW) — a home, ~1 kW continuous. An EV fast-charger, ~150 kW.
- Megawatt (MW) — a small data center, ~10 MW. A large industrial site, ~100 MW.
- Gigawatt (GW) — a nuclear reactor, ~1 GW. A frontier training cluster in 2026, ~1 GW.
- Terawatt (TW) — total US electric grid, ~0.5 TW. Global grid, ~3.2 TW.
The frontier cluster of 2030 is a 5–10 GW load. That is roughly the entire electric demand of a US state like Connecticut, dropped onto a single site, with the expectation that it will run flat-out 24/7.
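A quick sanity check translates that load into the unit ladder above, using the article's own rough figures:

```python
# Unit-ladder sanity check for a 5 GW frontier cluster,
# using the rough figures from the list above.

W = 1.0
KW, MW, GW, TW = 1e3 * W, 1e6 * W, 1e9 * W, 1e12 * W

cluster = 5 * GW
print(cluster / GW)          # reactors' worth of generation: 5.0
print(cluster / (10 * MW))   # small data centers' worth: 500.0
print(cluster / (0.5 * TW))  # share of the US grid: 0.01, i.e. ~1%

# Run flat-out 24/7, the load becomes a fixed energy bill:
hours_per_year = 8_760
print(cluster * hours_per_year / 1e12)  # ~43.8 TWh per year
```

One site, one percent of the national grid, running at full tilt around the clock — which is why the conversation moves from chips to substations.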
What this implies for AI strategy
If you are betting on AI through company exposure, capex direction, or your own product roadmap, the consequences of the power bottleneck are concrete:
- Site selection drives the roadmap. Whoever controls siteable, interconnectable power for the 2027–2030 vintage of training will have a meaningful capability edge. Not a marginal cost edge — a capability edge, because they can train models others physically cannot.
- Procurement is the new architecture choice. Behind-the-meter generation, on-site fuel cells, and SMR offtake agreements are now strategic decisions on par with chip selection.
- The grid is the slowest variable in the system. Lab capacity, capital, regulatory will, and chip supply can all flex faster than transmission.
- The bottleneck will rotate again. Once power and wires are unblocked — by gas, nuclear, behind-the-meter builds, fast-tracked transmission — the binding constraint will move back to chips, then to model architecture, then to applications. This is the loop. Knowing where it sits today is the first job.