March 2023

From: Brian and Tobias

Subject: A brief NVIDIA story

The tech world is hard pressed to find a public stock hotter than NVIDIA. Its stock price is up 90% this year and the company seems poised to capture tremendous value from the Cambrian explosion of AI currently taking place. However, NVIDIA is vulnerable, with an entrenched stack – flexible GPUs and a powerful software layer to manage those hardware assets – that has historically been an advantage but is also costly to end customers. We believe NVIDIA could face an Innovator’s Dilemma and fall victim to its own success, creating a market in which its own products are not the perfect solution.

This month’s Infrastructure Newsletter from Brian and Tobias is rather different from prior updates. We’ve been thinking a lot about Generative AI, and two themes consistently emerge: application-layer businesses want to control their costs, and they want to own their models. The latter points to a problem for OpenAI (a topic for another newsletter); the former points to a problem for NVIDIA, the topic of this month’s newsletter. We’ll conclude with ideas for where opportunities exist for startups.

When we first started thinking about engineering infrastructure a couple of years ago, we were amazed by the sheer complexity of the chip industry. We looked at a few deals but decided to stay away because of how powerful the incumbents are and how capital-intensive starting up is. After all, NVIDIA has been around for 30 years and generates roughly $27B in annual revenue, and giants like Intel have been around since the ’60s with more than twice that. But the more we learn, the more we think innovation is needed. And all great startups are necessarily high-risk, high-reward endeavors, with the deck stacked against them. While precarious, opportunities exist.

A Brief NVIDIA History:

When NVIDIA emerged, it was an improbable startup, entering a highly competitive industry with a premium offering that integrated two distinct functions – 2D and 3D graphics – onto one chip. This innovation enabled NVIDIA to take a dominant position in the graphics segment, which made up effectively 100% of its business from 1993 to 2008. Towards the end of this period, GPUs (Graphics Processing Units) began to emerge as an effective way to run the heavy-compute workloads needed for scientific research, and NVIDIA made a long-term bet on the trend. In 2007, NVIDIA introduced CUDA (Compute Unified Device Architecture), which allows engineers to program NVIDIA GPUs far more flexibly. NVIDIA also brought technical innovations to GPU design before its competitors did. Before long, GPUs were unlocking better graphics than ever before, and NVIDIA was the market leader in the technology.

Imagine a video game where a player throws a grenade towards a moving car. If the grenade hits, the car explodes. Think of the physics going on here: the weight of the grenade and the force in the arm; the speed of the car and the distance between it and the grenade thrower; the shadow the grenade casts on the ground, based on the time of day! GPUs have an amazing ability to run all of that math simultaneously to create a shockingly realistic digital experience.  

Over the course of the next decade, an unintentional boon manifested: neural nets and deep learning. The idea of neural nets dates back to the 1950s, when computer scientists and AI researchers (they weren’t called that then) postulated that a computer model mimicking the structure of the human brain might achieve good performance on tasks such as vision and language. Our brains contain billions of neurons firing electrical pulses at one another, sending many messages simultaneously, to create coherent experiences of objects like “dog,” words like “cat,” and concepts like “pet.” When neural network-like models are mapped onto processors, they reduce to large, parallel, computationally intensive sets of equations – mostly matrix multiplications. GPUs turned out to be particularly good at computing these equations quickly, enabling larger neural networks to be built and trained on larger datasets.
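To make the parallelism concrete, here is a minimal sketch, assuming PyTorch and an NVIDIA GPU are available, of the kind of computation a single neural-network layer reduces to. The sizes are arbitrary and purely illustrative.

```python
# A minimal sketch of the math one dense neural-network layer boils down to:
# a single large matrix multiplication. Assumes PyTorch is installed.
import torch

batch = torch.randn(1024, 4096)    # 1,024 inputs, each with 4,096 features
weights = torch.randn(4096, 4096)  # one layer's parameters

# On a CPU these ~17 billion multiply-adds run largely one after another;
# a GPU spreads them across thousands of cores and runs them simultaneously.
if torch.cuda.is_available():
    batch, weights = batch.cuda(), weights.cuda()

activations = batch @ weights      # the whole layer, computed in parallel
```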

As a result, GPUs catalyzed the machine learning revolution. In the 2010s, ML got really good at tasks like identifying images, translating text, making predictions, and spotting anomalies. Pretty quickly, ML was powering many underlying functions of our digital lives, like the ad served to you online or your match with a specific Uber driver. In this context, NVIDIA was in a position to power, quite literally, the AI revolution. With Generative AI coming online, NVIDIA seems perfectly positioned to reach new heights. But there is more to the story.

Let’s look at what NVIDIA is today. NVIDIA breaks its business into two core categories: (1) Compute & Networking and (2) Graphics. To support these two broad product lines, NVIDIA has dozens, if not hundreds, of hardware and software products, centered on GPUs, other AI-related systems (e.g., the high-performance networking enabled by the Mellanox acquisition), and CUDA. AI has driven the rapid growth of the Compute & Networking segment, which accounted for only ~$3B in revenue in 2020 but $15B in 2022. Over the same period, the Graphics segment grew from $7.6B to $12B, a much more modest increase. In total, NVIDIA generated around $27B in revenue in 2022.

NVIDIA has consistently stayed ahead of the pack in delivering the most performant AI hardware. The H100, specifically, is seen as the top-of-the-line chip for AI use cases. Across all relevant metrics, it represents a step-change improvement over what came before, enabling up to 30x speedups in LLM inference over the previous generation.

The hardware is only one piece of the puzzle for NVIDIA. Perhaps even more important is its software interface, CUDA. CUDA is a parallel computing platform and application programming interface that enables engineers to program GPUs to perform specific tasks. Perhaps the great power of NVIDIA has not been just developing arguably the world’s best hardware for complex, high-compute use cases, but rather making that hardware easily programmable through CUDA. GPUs are powerful machines capable of doing a range of tasks across different disciplines. CUDA helps engineers unlock that possibility. NVIDIA has made a bet on building lots of different SKUs and marrying them all with an overarching software interface. This flywheel is key to the moat NVIDIA has built – hook customers with best-in-class hardware, get them using it through CUDA, and then make NVIDIA hardware all the more appealing in the future because customers are already using the associated software. 
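For a sense of what “programming GPUs through CUDA” means in practice, here is a minimal, hypothetical sketch using Numba’s Python bindings for CUDA (one of many ways to write CUDA kernels); it assumes an NVIDIA GPU and the numba and numpy packages are available.

```python
# A toy CUDA kernel: add two large vectors, one element per GPU thread.
# Assumes a CUDA-capable NVIDIA GPU plus the numba and numpy packages.
import numpy as np
from numba import cuda

@cuda.jit
def add_vectors(a, b, out):
    i = cuda.grid(1)              # this thread's global index
    if i < out.size:              # guard against threads past the end of the array
        out[i] = a[i] + b[i]      # each thread computes one element, all in parallel

n = 1_000_000
a = np.random.rand(n).astype(np.float32)
b = np.random.rand(n).astype(np.float32)
out = np.zeros_like(a)

threads_per_block = 256
blocks = (n + threads_per_block - 1) // threads_per_block
add_vectors[blocks, threads_per_block](a, b, out)   # launch the kernel on the GPU
```

The point is less the arithmetic than the programming model: the engineer writes ordinary-looking code, and CUDA handles spreading the work across thousands of GPU threads.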

The software component of NVIDIA’s business is not only about creating a stickier product; it is also a big revenue opportunity for the company. The suite includes products to accelerate AI and ML pipelines, as well as cloud offerings that sit on top of the chips. If CUDA is how engineers tell NVIDIA hardware what to do, this broader software layer is how engineers then understand how their hardware is performing and manage those assets. Last year, NVIDIA’s CFO laid out the roadmap to a $1T addressable market. The breakdown was as follows: $100B from gaming, $300B from chips and systems, $150B from AI enterprise software, $150B from omniverse enterprise software, and $300B from auto. At least 30% of that opportunity is software that sits on top of NVIDIA’s chips.

This whole system – a focus on AI use cases and the interconnectedness of NVIDIA’s software platform with its hardware products – has justified investors paying a premium in public markets. NVIDIA trades at ~25x EV/revenue and ~110x EV/EBITDA. In contrast, Intel, a more traditional hardware player with less traction in the AI market and slower growth, trades at 2.3x EV/revenue and 6.9x EV/EBITDA. To put that difference in perspective, Intel generates ~4x more EBITDA than NVIDIA but is valued ~7x less. Importantly, NVIDIA’s growth rate in recent years far eclipses Intel’s. Since 2017, NVIDIA’s revenue has increased almost 4x, whereas more traditional players’ growth has flatlined.

What is a chip?

So what does this unbelievably impressive company make? Let’s pause for a moment to indulge in the pure wonder that is science and technology. 

Chips allow for the flow of electricity in a controlled way; they manage this flow via transistors, which switch on and off – the substance of code. Chips are tiny and can handle an incomprehensible volume of these pulses at unbelievable speeds. A chip in your phone is just a few millimeters across (about half a grain of rice!) and contains north of 10 billion transistors managing the flow of electricity simultaneously.

Building a chip requires precision at an atomic level, including arranging those billions of transistors in a dense grid. Picture laser light etching grooves just a few atoms wide (there are roughly 1.5 quintillion atoms in a grain of sand!) so electricity can flow through them. Then gaseous substances, in extremely hot environments, settle into the grooves. After many such steps, the chip, about the size of a baby’s fingernail, gets placed on a board, which then gets packaged up and becomes the brain of a computer. This is the stuff of fiction, made reality.

NVIDIA doesn’t manufacture chips. This model is referred to as “fabless,” where the chip designer does not own and operate the fabrication facilities (fabs). NVIDIA designs the chips, creates the software layer (CUDA) used to program them, and manages its manufacturing supply chain, which includes placing orders years in advance for capacity on the most advanced machinery at a fab. Importantly, there are massive companies focused solely on helping design chips better, including Cadence ($57B market cap), Synopsys ($60B market cap), and others.

The rapid advancement in chip design is no coincidence. In fact, it follows a fairly predictable and steady upward trajectory known as Moore’s Law, which observes that the number of transistors in an integrated circuit doubles roughly every two years thanks to ongoing advancements in the computing and semiconductor fields. This law, which has made chips cheaper and more powerful over time, is arguably on the verge of coming to an end. NVIDIA CEO Jensen Huang flatly declared “Moore’s Law is dead” in late 2022, warning not to expect rapidly declining GPU prices moving forward. So, in the eyes of NVIDIA, we may be at an inflection point at which this amazing technology stops evolving as rapidly as it has over the last several decades. The ramifications would be widespread, especially as AI use cases become more prevalent.
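To see what that doubling implies, here is a rough, idealized illustration in code; the 2,300-transistor starting point is Intel’s 4004 from 1971, and the exercise ignores all the real-world messiness of actual product cycles.

```python
# Rough illustration of Moore's Law: transistor counts doubling every ~2 years,
# starting from Intel's 4004 (1971, about 2,300 transistors).
year, transistors = 1971, 2_300
doubling_period_years = 2

while transistors < 10_000_000_000:   # the 10B+ transistors in a modern phone chip
    transistors *= 2
    year += doubling_period_years

print(f"Idealized doubling reaches ~10B transistors around {year}")
# Prints 2017 under this simple model; real flagship phone chips crossed the
# 10-billion-transistor mark around 2020, so the observation has held up well.
```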

NVIDIA in the age of Generative AI:

As you’ll recall, NVIDIA GPUs inadvertently and fortuitously catalyzed the machine learning revolution. Generative AI is a step function in that revolution, not only in capability but also in cost. NVIDIA is poised to make a killing because these workloads are computationally demanding and very, very expensive. Appropriately, its stock is skyrocketing as Generative AI takes the world by storm.

The costs, however, are now exorbitant. There are two distinct costs – training the models and running them, which is called inference. Generative AI companies, and all the incumbents integrating Gen AI into their offerings, need to contend with both. As an example, if Google integrated a ChatGPT-like model into its search offering, estimates suggest it would add roughly $35B in annual COGS, cutting meaningfully into its margins. Additionally, ChatGPT itself is estimated to pay ~$400M annually for AI inference at current usage rates.

Small startups are already dealing with these cost issues. We spoke with the CEO of a Gen AI business doing ~$20M in revenue. They’re spending $5M on inference! As he said, for every $4M of revenue his company generates, he gives $1M to OpenAI. This dynamic exists across the AI landscape, as a16z writes, “We estimate that, on average, app companies spend around 20-40% of revenue on inference and per-customer fine-tuning.”    

While this seems great for NVIDIA, it puts pressure on OpenAI to find cheaper solutions. Never before have software products had such un-software-like COGS. We’re talking about 60%-75% gross margin software tools! Strange. 
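The back-of-envelope math behind that range, using the numbers above and treating inference as the only COGS for simplicity (real COGS would include fine-tuning, hosting, and more), looks like this:

```python
# Back-of-envelope math behind the "60%-75% gross margin" observation,
# treating inference as the only COGS for simplicity.
revenue = 20_000_000          # the Gen AI startup's annual revenue
inference_spend = 5_000_000   # its annual inference bill

share = inference_spend / revenue
print(f"Inference as a share of revenue: {share:.0%}")   # 25%
print(f"Implied gross margin: {1 - share:.0%}")          # 75%

# At the top of a16z's 20-40% range, the margin compresses further:
print(f"Gross margin at 40% inference spend: {1 - 0.40:.0%}")   # 60%
```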

Many big players have been hard at work on this for a while. Meta and Google are deeply incentivized not to rely on NVIDIA GPUs. Their path to breaking free has been to develop compilers and frameworks for machine learning that are chip-agnostic. PyTorch (Meta) and TensorFlow (Google) are both designed with this in mind, creating the possibility of escaping the NVIDIA ML chip monopoly.
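A minimal sketch of what “chip-agnostic” means at the framework level, assuming PyTorch is installed: the model code below is identical whether it runs on an NVIDIA GPU, another supported accelerator, or a plain CPU.

```python
# The same PyTorch model code targets whatever hardware backend is available.
import torch
import torch.nn as nn

# Pick a backend at runtime; other accelerators plug in as additional device types.
device = "cuda" if torch.cuda.is_available() else "cpu"

model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 10)).to(device)
x = torch.randn(32, 512, device=device)
logits = model(x)   # identical code regardless of which chip executes it
```

The framework, not the application developer, takes on the job of mapping the computation to the underlying chip, which is exactly what loosens CUDA’s grip.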

This cost problem could become existential for NVIDIA. In Dylan Patel’s article, “How NVIDIA’s CUDA Monopoly in Machine Learning is Breaking – OpenAI Triton and PyTorch 2.0” (we recommend reading his newsletter SemiAnalysis for more on this subject), he argues that although NVIDIA has benefitted from the ease of use of CUDA for many years, there is trouble on the horizon. As companies like OpenAI, Meta, and Google build their own software platforms, the only thing left to compete on is efficient and economical chip architecture. He also argues that NVIDIA made a huge strategic mistake by not making its software platform extensible to other hardware, instead keeping it compatible only with NVIDIA’s own products. In his words, “NVIDIA’s colossal software organization lacked the foresight to take their massive advantage in ML hardware and software and become the default compiler for machine learning… Why aren’t they the one building a simplified CUDA like Triton for ML researchers?”

If NVIDIA loses the edge on hardware, it loses the edge overall. The beautifully symbiotic relationship between hardware and software that created NVIDIA’s moat might be for naught in an age where the only thing that matters is cost, a core need NVIDIA may not be able to meet. Given the rapidly growing need for AI hardware, and the increasingly frustrating cost of using it, perhaps now is the moment to catch NVIDIA off guard. Its software is no longer the only game in town, and its hardware, however performant, leaves a lot to be desired on cost for LLM use cases.

Given these cost problems, the obvious question is: why doesn’t NVIDIA just reduce prices? The answer, so far, is that it hasn’t had to, thanks to its market dominance; perhaps it will need to if viable alternatives emerge. NVIDIA enjoys healthy margins (57% gross margin, ~15% EBIT margin) and unparalleled scale, so it has room to reduce prices if that’s what the market demands. Whether it does remains to be seen. It could decide to price its products the way Apple prices the iPhone, opting for a premium positioning justified by its dominance in the market.

More importantly, it’s not entirely clear that NVIDIA can bring prices down enough to remove the cost bottleneck in the industry and remain profitable. To make it dead simple: if the market only really adopts LLMs when compute COGS are 6% of revenue rather than 30%, does NVIDIA’s current architecture facilitate a 5x price reduction? That is unclear. It may require different designs and more efficient chips, and whether those are developed at all, by NVIDIA or by another company, remains to be seen.

What’s Next?

To sum up our point of view: we’re at a transition point, with lots of opportunity to change the status quo. NVIDIA is the de facto dominant player in AI hardware, and it has built a phenomenal business. However, running new AI models on its hardware is prohibitively expensive, and NVIDIA itself seems to believe we should no longer expect rapid cost and capability improvements in chips, given the end of Moore’s Law. This has the makings of an Innovator’s Dilemma: a company at the peak of its might, unable to move fast enough or to disrupt itself in order to adapt to what’s coming.

There are a bunch of themes we’ll be exploring with this history and context in mind.

We will see how the world changes, and what infrastructure is needed to enable what’s to come. NVIDIA is at a crucible moment: it will either strategically embrace how AI is evolving or stick to its current way of doing things. The latter may be enough, but it might not be. If it isn’t, lots of new businesses will emerge going after $700B of market cap.