
What is yield, and why does it set the price of compute?

Yield is the fraction of chips that come off the line working. It is the single most under-discussed number in AI economics: yield decides how many GPUs a wafer produces, which dies become an H100 versus an H800, and why a frontier accelerator costs what it does.

Where the binding constraint sits today

For the largest AI dies, defect density and die area compound: yield falls with their product. Yield, not transistor count, sets the marginal cost of compute at the frontier.

Yield is a probability, not an inspection step

Every wafer that enters a fab will pick up some number of defects during the hundreds of process steps it goes through: a particle of dust, a misaligned exposure, a contamination event, a metal void. The fab cannot eliminate them. It can only push their rate down.

A working chip is a die that lands somewhere on the wafer where the random pattern of defects happens to miss its active circuits. The probability of that happening shrinks fast as the die gets bigger. This is why "good yield" is a number between zero and one, not a quality-control verdict.
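The "random pattern of defects" picture can be checked with a toy simulation. The sketch below is only an illustration under simplifying assumptions: dies are square, they tile a square field rather than a round wafer, defects are point-like and uniformly scattered, and every defect is fatal. The function name and parameters are hypothetical.

```python
import random

def simulate_clean_fraction(die_side_mm, defects_per_cm2, grid=40, seed=0):
    """Scatter point defects uniformly over a grid x grid field of square dies
    and return the fraction of dies that received no defect at all."""
    rng = random.Random(seed)
    field_side_mm = die_side_mm * grid
    field_area_cm2 = field_side_mm ** 2 / 100.0  # 100 mm^2 per cm^2
    n_defects = round(defects_per_cm2 * field_area_cm2)

    hit = set()  # (row, col) of every die struck by at least one defect
    for _ in range(n_defects):
        x = rng.uniform(0, field_side_mm)
        y = rng.uniform(0, field_side_mm)
        col = min(int(x // die_side_mm), grid - 1)
        row = min(int(y // die_side_mm), grid - 1)
        hit.add((row, col))
    return 1 - len(hit) / (grid * grid)

# Same defect density, two die sizes: 1 cm^2 versus roughly 8 cm^2.
print("small die:", simulate_clean_fraction(10.0, 0.1))
print("large die:", simulate_clean_fraction(28.5, 0.1))
```

The defect rate is identical in both runs. Only the size of the target changes, and the larger die is hit far more often.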

Defect density times die area sets the curve

A simple working model: the probability a die has zero killer defects falls roughly exponentially with the product of defect density and die area. Industry shorthand calls this the D₀ × A product. A small die at a mature node has a tiny D₀ × A and yields above 90 percent. A reticle-sized die at a brand-new node can yield 30 to 60 percent in the early ramp, climbing over months.
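The exponential relationship described here is commonly written as the Poisson yield model, Y = exp(−D₀ × A). A minimal sketch, assuming that model and a defect density of 0.1 per cm² (fabs also use other models, such as Murphy's, which give somewhat different numbers):

```python
import math

def poisson_yield(die_area_mm2, defects_per_cm2):
    """Probability that a die has zero killer defects under the Poisson model."""
    area_cm2 = die_area_mm2 / 100.0  # 100 mm^2 per cm^2
    return math.exp(-defects_per_cm2 * area_cm2)

# Illustrative die sizes only; 0.1 defects per cm^2 is an assumed value.
for area in (100, 400, 814):
    print(f"{area:>4} mm^2 -> {poisson_yield(area, 0.1):.1%}")
```

Because the area sits in the exponent, yield does not decline in proportion to area: each added square centimeter multiplies the yield by the same factor, exp(−D₀).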

This is why bigger dies cost more than their area would suggest. An H100 die is roughly 814 mm², or about 8.1 cm². At a defect density of about 0.1 per cm² on a mature 5 nm-class process, D₀ × A is about 0.81, and the exponential model gives a defect-free yield near 44 percent; with realistic uncertainty in D₀, that is a 30 to 50 percent range before any rescue. More than half the dies are born broken.

~814 mm²

NVIDIA H100 die area, near the EUV reticle limit

~0.1/cm²

Mature defect density for leading-edge logic processes

Source: TSMC defect-density disclosures and industry yield modeling; Dylan Patel on SemiAnalysis, 2024-2025
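To see why cost grows faster than area, it helps to combine the yield curve with the number of dies that fit on a wafer. The sketch below uses a common first-order dies-per-wafer estimate for a 300 mm wafer and the exponential yield model. Wafer cost is normalized to 1.0 because no wafer price is given here, and the function names are hypothetical.

```python
import math

WAFER_DIAMETER_MM = 300.0

def gross_dies_per_wafer(die_area_mm2, wafer_diameter_mm=WAFER_DIAMETER_MM):
    """First-order estimate: wafer area over die area, minus an edge-loss term."""
    radius = wafer_diameter_mm / 2
    whole = math.pi * radius ** 2 / die_area_mm2
    edge_loss = math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2)
    return int(whole - edge_loss)

def cost_per_good_die(die_area_mm2, defects_per_cm2, wafer_cost=1.0):
    """Wafer cost spread over defect-free dies only (no binning rescue)."""
    gross = gross_dies_per_wafer(die_area_mm2)
    good_fraction = math.exp(-defects_per_cm2 * die_area_mm2 / 100.0)
    return wafer_cost / (gross * good_fraction)

small = cost_per_good_die(100, 0.1)
large = cost_per_good_die(814, 0.1)
print(f"area ratio: {814 / 100:.1f}x, cost ratio: {large / small:.1f}x")
```

Under these assumptions the large die has about eight times the area of the small one but costs roughly twenty times as much per defect-free die, because it loses on both edge waste and yield.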

Binning rescues the dies that almost worked

Most defects do not kill the whole die. They kill a streaming multiprocessor, a memory channel, or a fraction of cache. The fab tests every die and groups them by what survived. This is binning.

A single GPU design ships as multiple products. Dies with all SMs working become the flagship. Dies with a few broken SMs become the cut-down product. Dies with broken memory channels become a lower-bandwidth SKU. Dies with broken NVLink ports can feed a reduced-interconnect version, which is the slot the export-controlled parts occupy. The H100, H800, and H20 are not separate designs. They are the same silicon family, binned and configured for what passed.

144 → 132 / 114

Full GH100 die has 144 SMs; H100 SXM ships with 132 enabled and H100 PCIe with 114. The disabled SMs are the margin that harvesting relies on.

3 SKUs

H100 / H800 / H20 carved out of one die family
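Binning on SM count can be sketched as a simple threshold lookup. The thresholds below come from the SM counts quoted above. The bin labels, the "scrap" fallback, and the sample test results are hypothetical, and real binning also weighs clock speed, power, memory, and demand.

```python
from collections import Counter

FULL_DIE_SMS = 144

# (label, minimum working SMs), checked from the most demanding bin down.
# Labels and policy are illustrative, not NVIDIA's actual test flow.
BINS = [
    ("H100 SXM class", 132),
    ("H100 PCIe class", 114),
]

def bin_die(working_sms):
    """Return the first bin whose SM requirement the die meets."""
    if not 0 <= working_sms <= FULL_DIE_SMS:
        raise ValueError("working_sms must be between 0 and 144")
    for label, min_sms in BINS:
        if working_sms >= min_sms:
            return label
    return "scrap"

# Hypothetical wafer-test results: working SM count per die.
tested = [144, 141, 132, 131, 120, 114, 113, 90]
print(Counter(bin_die(n) for n in tested))
```

A die with 131 working SMs misses the top bin by one unit yet still ships, which is the whole point of designing the product stack around disabled units.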

Harvest yield is the number that matters

Defect-free yield gets the headlines. Harvest yield is the real economic number: the fraction of dies that ship as *some* sellable product after binning. Modern logic processes hit harvest yields above 80 percent even when defect-free yield is much lower, because the design was built to be binnable.

This is why a fab announcing "yield" without saying which kind is doing a sleight of hand. A mature node may have 90 percent harvest yield and 50 percent defect-free yield at the same time. The first sells the wafer. The second sells the flagship.
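The gap between the two yields can be estimated with a small extension of the exponential model. The sketch below assumes that a fixed share of the die is made of repeated units that can be fused off, that each defect in that region kills at most one unit, and that any defect elsewhere kills the die. The 70 percent repairable share is a hypothetical value chosen for illustration. The 12 spare units correspond to 144 minus 132 SMs.

```python
import math

def poisson_cdf(k, lam):
    """P(N <= k) for a Poisson-distributed defect count with mean lam."""
    return sum(math.exp(-lam) * lam ** i / math.factorial(i) for i in range(k + 1))

def defect_free_yield(die_area_mm2, d0_per_cm2):
    return math.exp(-d0_per_cm2 * die_area_mm2 / 100.0)

def harvest_yield(die_area_mm2, d0_per_cm2, repairable_fraction, spare_units):
    """Fraction of dies sellable after fusing off up to spare_units units.
    Slightly conservative: two defects in one unit are counted as two losses."""
    lam = d0_per_cm2 * die_area_mm2 / 100.0
    critical_ok = math.exp(-lam * (1 - repairable_fraction))
    repairable_ok = poisson_cdf(spare_units, lam * repairable_fraction)
    return critical_ok * repairable_ok

# repairable_fraction=0.7 is an assumed value, not a published figure.
print(f"defect-free: {defect_free_yield(814, 0.1):.0%}")
print(f"harvest:     {harvest_yield(814, 0.1, 0.7, 12):.0%}")
```

With these inputs the same wafer shows a defect-free yield near 44 percent and a harvest yield near 78 percent. Both numbers are "yield", which is why the label matters.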

The frontier squeezes the curve from both sides

AI is pushing two trends against yield at once. Die area keeps growing as chips bump against the reticle limit and then split into chiplets to escape it. Process nodes keep tightening, which raises defect density during the multi-year ramp before the recipe stabilizes.

The result is that frontier accelerators are produced at lower yields than the smaller consumer chips on the same node. The cost per good die does not fall the way Moore-era extrapolations assumed. This is one reason the price of frontier compute has not collapsed even as transistor counts have grown.
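Both pressures, and the relief that chiplets offer, show up in a short calculation. The sketch below compares the silicon that must be fabricated per working package when a fixed total area is built as one die or as several chiplets. It assumes the exponential yield model, perfect known-good-die testing before assembly, no packaging losses, and no binning. The 800 mm² total and the defect densities of 0.1 and 0.3 per cm² are illustrative values, and the function name is hypothetical.

```python
import math

def silicon_per_good_package(total_area_mm2, d0_per_cm2, n_chiplets):
    """mm^2 of silicon fabricated per working package.
    Assumes bad chiplets are discarded individually before assembly."""
    chiplet_area = total_area_mm2 / n_chiplets
    chiplet_yield = math.exp(-d0_per_cm2 * chiplet_area / 100.0)
    return n_chiplets * chiplet_area / chiplet_yield

for d0 in (0.1, 0.3):  # mature process versus an assumed early-ramp value
    for n in (1, 2, 4):
        area = silicon_per_good_package(800, d0, n)
        print(f"D0={d0}/cm^2, {n} chiplet(s): {area:,.0f} mm^2 per good package")
```

Raising the defect density hurts the monolithic die most, and splitting the design recovers much of the loss. In practice frontier parts use chiplets to add total area rather than hold it constant, so part of that gain is spent on more silicon.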

The strategic read

Read every chip launch through the yield lens. A new accelerator with a much larger die at a much newer node is being sold at a yield that the fab is still climbing. Volume targets, allocation priority, and price all bend around that curve. The chip exists. Whether it can be bought is a yield question.

And read every cut-down SKU as a confession. The H800 and H20 are not weaker products; they are evidence that the harvest yield needed somewhere to go. Export controls accelerated the productization but did not invent it.