How did Nvidia almost die — and what did they bet on?
In 1996 Nvidia had thirty days of cash and a product nobody wanted. It survived by making one good chip (the RIVA 128) and one good pivot (building for the polygon-based standard that PC games were adopting, rather than its own proprietary approach). A decade later it made a much harder bet: CUDA, a programming model for using GPUs as general-purpose compute, at a time when almost nobody was asking for it. The bet took roughly fifteen years to pay off. By 2024 it had made Nvidia, at least briefly, the most valuable company in the world.
Some option bets only pay off if the company keeps funding them through years of looking foolish. The strategic question is which conditions make a firm capable of holding that conviction — and what the AI-era equivalents look like today, while the next round of similar bets is still in its uncomfortable phase.
Founding and near-death
Nvidia was founded in April 1993 by Jensen Huang, Chris Malachowsky, and Curtis Priem at a Denny's in San Jose. Huang was 30 years old, had been at LSI Logic for eight years, and brought the business plan. Malachowsky and Priem had worked together at Sun Microsystems on the GX graphics architecture. The company started with about $40,000 in the bank and went on to raise roughly $20 million in venture funding from Sequoia, Sutter Hill, and others. By Huang's account, Don Valentine of Sequoia told him after the pitch, 'If you lose my money, I'll kill you.'
The first chip, the NV1, shipped in 1995. It was technically ambitious (it rendered quadratic surfaces instead of the polygons everyone else was settling on) and a commercial failure. Direct3D, the polygon-based API that Microsoft released in 1996 and made the Windows standard, was incompatible with the NV1's approach. The NV2 was cancelled. By mid-1996 Nvidia was running on fumes and had laid off more than half its staff; by Huang's later account, the company was down to about a month of payroll by the time its next chip shipped.
The rescue product was the RIVA 128, shipped in August 1997. It was a more conventional polygon-based 3D accelerator, built on a new architecture, and aimed at the consumer PC gaming market and the Direct3D standard that the NV1 had missed. The chip sold well, gave Nvidia eighteen months of survival cash, and was followed by the RIVA TNT (1998), the GeForce 256 (1999, the first chip Nvidia called a 'GPU'), and the GeForce 2 (2000). Each generation took more market share. By 2002 Nvidia was a public company with $1.4 billion in revenue and the dominant supplier of consumer 3D graphics.
Source: Jensen Huang's various retellings, most reliably his 2024 Stanford GSB "View From The Top" talk and his 2023 interview on the *Acquired* podcast.
The CUDA bet
By 2005, Nvidia GPUs were the most parallel processors in any consumer device. A high-end GeForce had hundreds of small floating-point units running simultaneously to render pixels. A few academics — at Stanford, at the University of Illinois, at several physics labs — had started writing scientific code that exploited this parallelism, using awkward graphics APIs to run protein-folding simulations and fluid dynamics on consumer graphics cards. The performance was real: 10x to 100x speedups over CPUs on the workloads that fit the parallel structure.
Nvidia's response, beginning around 2004 and shipping in 2006-07, was a piece of software called CUDA (Compute Unified Device Architecture) plus a hardware redesign (the Tesla architecture, also 2006, no relation to the car company) that made the GPU's compute units explicitly usable for non-graphics work. CUDA exposed the GPU as a parallel processor that could be programmed in something resembling C, rather than as a graphics pipeline that had to be tricked into doing math. Every Nvidia GPU shipped after 2007 included CUDA support, including consumer cards. Every PhD student doing simulation work could now use a $400 graphics card from Best Buy to run a workload that previously required a $50,000 cluster.
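To make 'something resembling C' concrete: in CUDA the programmer writes a kernel, the code for a single thread, and the hardware runs many copies of it at once, each working out which element it owns from its block and thread numbers. The sketch below emulates that indexing scheme in plain Python for SAXPY (y = a*x + y). It is not CUDA code and it runs serially on the CPU; the names `saxpy_kernel`, `launch`, and `saxpy` are illustrative assumptions, and only the index arithmetic mirrors what CUDA exposes as blockIdx, blockDim, and threadIdx.

```python
def saxpy_kernel(block_idx, block_dim, thread_idx, n, a, x, y):
    """Work done by ONE emulated thread: update a single element of y."""
    # Global index of this thread, computed the way a CUDA kernel would.
    i = block_idx * block_dim + thread_idx
    # Threads past the end of the array do nothing (the last block may be partial).
    if i < n:
        y[i] = a * x[i] + y[i]


def launch(kernel, num_blocks, block_dim, *args):
    """Serial stand-in for a GPU launch: call the kernel once per (block, thread)."""
    for block_idx in range(num_blocks):
        for thread_idx in range(block_dim):
            kernel(block_idx, block_dim, thread_idx, *args)


def saxpy(a, x, y, block_dim=4):
    """Return a*x + y elementwise, using the emulated grid of threads."""
    n = len(x)
    out = list(y)  # copy so the caller's list is left untouched
    num_blocks = (n + block_dim - 1) // block_dim  # round up to cover all n elements
    launch(saxpy_kernel, num_blocks, block_dim, n, a, x, out)
    return out


result = saxpy(2.0, [1.0, 2.0, 3.0, 4.0, 5.0], [10.0] * 5)
```

On a real GPU the two loops in `launch` disappear: every (block, thread) pair runs concurrently, which is why workloads with this structure saw the large speedups described above.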
The strategic case inside Nvidia at the time was uncomfortable. CUDA was not a revenue line. Nvidia gave it away for free. The Tesla-line chips for serious compute customers were a small business, in single-digit millions of dollars of annual revenue for several years. The R&D investment to support CUDA — compilers, libraries, documentation, developer relations, a parallel programming research team — was significant and ongoing. For most of the period from 2007 to 2014, CUDA cost Nvidia more than it brought in. Huang kept funding it anyway, on the thesis that parallel programming was the future of high-performance compute and that whoever owned the leading parallel programming environment would own the resulting business.
AlexNet, 2012 — the option starts to vest
In September 2012, three researchers — Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton at the University of Toronto — submitted a neural network called AlexNet to the ImageNet image-classification competition. AlexNet won by a margin that startled the field: 15.3% top-5 error against the next-best entry at 26.2%. The technique they used (a deep convolutional neural network trained with backpropagation) had been around for decades, but two things had changed. ImageNet had provided a large enough labeled training set. And Krizhevsky had trained the network on two consumer Nvidia GTX 580 GPUs in his bedroom over a week — a compute scale that would have been impractical even five years earlier.
The ImageNet result was the moment GPUs stopped being 'graphics chips that could occasionally do science' and became 'the substrate of deep learning.' Within two years, every serious computer-vision research group was using Nvidia hardware, almost always with CUDA. In 2015 Google open-sourced TensorFlow, which supported CUDA from its first release; TPU support came later. In 2017 the transformer paper came out of Google Brain, with its models trained on eight Nvidia P100 GPUs, and most of the research field replicated it on Nvidia hardware. By 2020 GPT-3 had been trained on a Microsoft-built cluster reported to hold roughly 10,000 Nvidia V100s.
This was the moment the long CUDA investment had been built for. When the workload arrived, the libraries existed, the tooling worked, and the deep-learning research community had already trained itself on Nvidia hardware. There was no comparably mature alternative. Intel had MKL-DNN, AMD had ROCm, and Google had TPUs with JAX, but the broader ecosystem (PhDs, libraries, papers, tutorials, GitHub repositories) ran on CUDA, and the compounding network effects made it nearly impossible for a competitor to catch up.
The 2022-2024 inflection
From 2015 to 2022, Nvidia's datacenter revenue grew steadily, from about $300M to about $15B over seven years: fast by any ordinary standard, but with no single breakout moment. The crypto-mining boom briefly inflated GPU demand in 2017 and 2021 but did not establish a durable customer base. The real inflection was generative AI.
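The endpoints quoted above imply a compound annual growth rate that is easy to check. The snippet below is a minimal sketch using only the rounded figures from the text ($300M, $15B, seven years); the function name `cagr` is illustrative, and the result is only as precise as those rounded inputs.

```python
def cagr(start, end, years):
    """Compound annual growth rate implied by two endpoints `years` apart."""
    return (end / start) ** (1 / years) - 1


# Rounded figures from the text: about $300M growing to about $15B over seven years.
rate = cagr(300e6, 15e9, 7)
print(f"Implied growth: {rate:.0%} per year")
```

That works out to roughly 75% a year, which is the baseline against which the post-2022 quarters, growing several-fold year-over-year, should be read.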
ChatGPT launched in November 2022. By the second quarter of 2023, Nvidia's datacenter revenue had roughly doubled year-over-year. By Q3 2023 it was three times the prior year. By Q1 2024 it was five times. The H100, launched in 2022 as the successor to the A100, became the most sought-after single product in the technology industry. Lead times stretched to 12 months. The chip's contribution margin was reported to be over 80% on a price north of $30,000 per unit. Nvidia's stock price went from a low of roughly $110 in October 2022 to over $1,000 in May 2024 (pre-split). In June 2024 its market capitalization briefly exceeded both Apple's and Microsoft's, making Nvidia the most valuable company in the world.
What had taken fifteen years to set up paid back in eighteen months. The CUDA ecosystem, the parallel hardware roadmap, the NVLink fabric, the DGX and HGX system designs, the relationship with TSMC for advanced packaging — all of it had been built quietly through years when none of it mattered very much in revenue terms. When the AI workload arrived, the ecosystem snapped into place and everything Nvidia had been quietly compounding became the only available answer to the question 'how do I train a frontier model?'
Source: Nvidia 10-Q filings FY2023-FY2025; *The Information* and Reuters coverage of H100 lead times; Krizhevsky et al., "ImageNet Classification with Deep Convolutional Neural Networks," NeurIPS 2012.
Strategic read
The Nvidia story is sometimes told as a single great bet that paid off. The more accurate framing is that it is two great bets, a decade apart, with the second only possible because the first had been managed carefully through the period when nobody noticed. The 1996 bet was on consumer 3D: survival-grade and defensive. The 2006 bet was on general-purpose parallel compute: strategic, offensive, and entirely speculative for at least a decade.
What enabled Nvidia to hold the CUDA bet through ten years of looking foolish? Three things. First, the company was profitable on graphics throughout — CUDA was a side bet, not a do-or-die one. Second, the CEO had been there the whole time and had a personal conviction strong enough that nobody in the company seriously challenged the investment. Third, the company had no comparable internal alternative to fund — there was no parallel team building a different vision of the future that CUDA was competing against. The result was that CUDA accumulated compounding investment for fifteen years without interruption.
The investment lesson is that some of the most valuable industrial positions are built during periods where the position looks wrong to the surrounding consensus. The reason most companies do not make these bets is structural: shareholder pressure, internal politics, CEO turnover, or the difficulty of justifying the cost of an investment whose payoff is uncertain and whose timing is wrong. Nvidia avoided every one of these traps, partly by luck and partly because Jensen Huang ran the company as a private fiefdom even after the IPO. The investor asking 'who is the next Nvidia' should look for the same configuration today — a profitable core business, a long-tenured CEO with technical conviction, and an unpopular side bet on a structural technology shift that the market has not yet priced in. Several candidates exist. None of them are obvious yet, which is the point.