The GPU Market Beyond the Hyperscalers (excl. Enterprise)
Following Wednesday’s earnings call, Nvidia CEO Jensen Huang relayed (through Bloomberg) that hyperscalers represent roughly 45% of Nvidia’s data center business. That stands in contrast to a quote from CFO Colette Kress in 4Q:
In the fourth quarter, large cloud providers represented more than half of our data center revenue, supporting both internal workloads and external public cloud customers.
Going from 51% of this quarter’s data center revenue to 45% would give you a ~$1.5b decline, and that’s at the very low end. Say hyperscalers were instead 55% of revenue; all of a sudden, that 10-point decline is $2.6b (quick math below). So what’s driving it, and does it actually matter if the broader business is doing well? To answer that question, let’s quickly summarize what the hyperscalers are doing.
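For the skeptics, here is the arithmetic spelled out. The ~$26.3b data center base is my assumption, not a figure stated above; it is the number consistent with the 154% y/y growth cited later in this piece.

```python
# Quick math on the hyperscaler mix shift. The $26.3b data center revenue
# base is an assumption, consistent with the 154% y/y growth cited below.
dc_revenue = 26.3  # $b, assumed quarterly data center revenue

for prior_mix in (0.51, 0.55):
    decline = (prior_mix - 0.45) * dc_revenue
    print(f"{prior_mix:.0%} -> 45% of DC revenue: ~${decline:.1f}b decline")

# 51% -> 45%: ~$1.6b  (the ~$1.5b low-end case, give or take rounding)
# 55% -> 45%: ~$2.6b
```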
Start with Google, which has a formidable platform surrounding the Tensor Processing Unit (TPU), a custom ASIC designed specifically for AI and machine learning (ML) workloads. TPUs offer a solid alternative to Nvidia GPUs, especially within Google’s own infrastructure and Google Cloud. Because they are purpose-built for ML, they trade away the generality of GPUs, which apply broadly to parallel computing from graphics rendering to AI.
For Microsoft, the Microsoft Azure Maia AI Accelerator (gotta find a way to shorten that) is an in-house development and part of a broader strategy to reduce reliance on Nvidia’s GPUs. It builds on the earlier “Project Brainwave” initiative, which used FPGAs to power real-time AI processing on Azure. On that note, it’s pretty clear that Microsoft, at least tangentially, is looking past the training market that Nvidia has a stranglehold on and toward the 5-10 year adoption cycle of edge applications.
It’s a bold strategy, one I hope to probe as I better understand edge AI broadly and Apple’s strategy as it relates to Apple Intelligence. Microsoft has also been enhancing its ARM-based chips for Surface devices, so this does seem to be a device-driven strategy, as opposed to the compute focus of everyone else. It brings to mind Fairchild’s focus on consumer applications of transistors while everyone else was fixated on the military. While I want to do a whole article on Microsoft, let’s bring it back to Nvidia.
Summing it up, these guys are getting a dose of good old-fashioned vertical integration, and that’s what’s driving the decline in the hyperscaler share of Nvidia’s data center revenue. And yet, data center revenue is still finding a way to grow at a well-fed 154% y/y. Therein lies the broadening of spend and its perpetrators: sovereign AI and Tier 2 CSPs. The long-tail enterprise world also falls in this bucket, but frankly there’s too much there to dive into; it’ll require its own article at some point.
Sovereign AI
Nvidia defines sovereign AI as “a nation’s capabilities to produce artificial intelligence using its own infrastructure, data, workforce and business networks.” When I hear this, my mind jumps first to smart cities. Imagine a New York subway system where, instead of paying with your card or phone, you simply look into a camera with facial recognition synced to payment systems, and the fare is deducted from your account. Similarly, imagine a world where roads are equipped with sensors and cameras to divert autonomous vehicles around an accident and optimize traffic flow. These sound somewhat futuristic, but then again, so did smart TVs and handheld computers in the 1990s. Either way, let’s look at a more current example.
In the context of sovereign AI, countries are leveraging their digital public infrastructure (DPI) to get closer to these seemingly utopian outcomes, which we can see through the example of India. Before 2009, identifying oneself in India was a hassle. In fact, 400 million Indians lacked any form of individual identity, leaving them stranded outside the credit system and unable to claim public welfare. Enter the “Digital India” initiative, with its crown jewel, Aadhaar. The program ties a person’s identity to a 12-digit unique identifier and collects fingerprints, iris scans, and photographs, all stored in a centralized database. When a person uses Aadhaar for identity verification, the system matches their biometrics against the database to confirm their identity. In the time since the initiative was introduced, that 400 million figure has come down to roughly 100 million (the gap between India’s estimated population of ~1.4 billion and the ~1.3 billion Aadhaar cards issued).
So how can we optimize these systems with Nvidia GPUs? Jensen has been on record saying countries “need to own the production of their own intelligence.” It’s easy to imagine how processing mass amounts of biometric data could improve Aadhaar’s efficiency: authentication requests arrive by the millions, and the parallel throughput of a GPU can accelerate fingerprint matching where needed (sketched below). Virtualization also means a single GPU can be shared across multiple workloads when Aadhaar’s demand is low. To that end, India announced a $1.2b spending package for digital infrastructure, highlighted by the purchase of 10,000 GPUs.
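Aadhaar’s actual matching stack isn’t public, so treat the following as a minimal sketch of why this workload suits GPUs, not as their implementation: if each fingerprint is reduced to a fixed-length template vector, a batch of authentication requests collapses into one large matrix multiply, exactly the operation GPUs are built for. The template length, threshold, and similarity approach are all assumptions for illustration.

```python
import numpy as np  # swap for `import cupy as np` to run the same math on a GPU

# Purely illustrative 1:N identification. Assume each fingerprint is reduced
# to a fixed-length feature vector ("template"), so matching a request
# against the enrolled population becomes a nearest-neighbor search.
rng = np.random.default_rng(0)
DIM = 128                                    # template length (assumed)
gallery = rng.standard_normal((200_000, DIM)).astype(np.float32)
gallery /= np.linalg.norm(gallery, axis=1, keepdims=True)  # normalize once

def identify(queries, threshold=0.35):
    """Score a whole batch of query templates against the gallery at once."""
    queries = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    scores = queries @ gallery.T             # one big matmul: the GPU-friendly step
    best = scores.argmax(axis=1)             # best gallery candidate per query
    hits = scores[np.arange(len(queries)), best] >= threshold
    return best, hits

# 256 incoming authentication requests handled in a single batched call.
batch = rng.standard_normal((256, DIM)).astype(np.float32)
ids, accepted = identify(batch)
print(f"{int(accepted.sum())} of {len(batch)} requests matched a template")
```

The point of the batching is the virtualization argument above: when request volume is low, the same device can serve other workloads, and when it spikes, the matmul simply gets wider.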
India is just one example. In countries where food demand exceeds supply, agricultural AI can function as a kind of ERP, letting the government subsidize technologies that monitor and optimize crop yield and soil health, which then “feeds” (hehe) through to better utilization of fertilizer and pesticides, saving costs in the long run. Realistically speaking, much of the early application will be for military purposes. The DoD has already been granted $1.8b for AI defense initiatives, a number only expected to increase in coming years given the rapid rise in malicious cyberattacks (see exhibit below). And if DARPA is any indicator, the military is typically the first mover in early adoption.
Tier 2 CSPs
AWS, Google Cloud, and Microsoft Azure have been (and likely will continue to be) enjoying healthy penetration of the cloud computing space with their offerings. Azure has been the best-performing cloud platform, gaining ~2 points of market share in 2023, while Google Cloud has made incremental gains. Broader Tier 1 CSP share gains can be attributed to the fact that we are early in the AI capex cycle, so larger companies with greater capital outlays prefer the global reach and security guarantees of an AWS or Azure. As we shift into the middle of the investment cycle, I believe the Tier 1 share gains will be partially given back as spend broadens out across SMEs while also narrowing toward the specialization those customers need.
Take the example of DigitalOcean. Its core offering, Droplets, are virtual private servers that can host basic websites or small applications. With a mission to “simplify cloud computing so that developers and businesses can spend more time building software that changes the world,” DigitalOcean has built a platform for SMBs and niche players seeking cost-effective, scalable AI/ML tools without the overhead of larger, more complex platforms like AWS. Pricing is also far easier to understand: instead of paying for data transfer by the gigabyte or for a specific instance type, the entry-level Droplet costs $5 per month (a contrast sketched below). As the AI base expands beyond the hyperscalers, I expect this simplicity to take priority in SMID-cap land.
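To make the billing contrast concrete, here is a toy sketch. The $5/month figure is DigitalOcean’s published entry price; every rate on the metered side is hypothetical, invented only to show why usage-based bills are harder to reason about.

```python
# Flat vs. metered billing. The metered rates below are hypothetical,
# chosen only to illustrate the forecasting burden of usage-based pricing.
HOURS_PER_MONTH = 730  # standard approximation

def droplet_bill() -> float:
    return 5.00  # flat: one number, known before the month starts

def metered_bill(instance_hours: float, gb_egress: float,
                 hourly_rate: float = 0.0104,       # hypothetical $/hour
                 egress_rate: float = 0.09) -> float:  # hypothetical $/GB
    # Usage-based: the bill depends on variables you must forecast.
    return instance_hours * hourly_rate + gb_egress * egress_rate

print(droplet_bill())                        # 5.0, every month
print(metered_bill(HOURS_PER_MONTH, 150.0))  # ~21.09, and it moves with usage
```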
This is not to say that the Tier 1 CSPs will stop growing, rather that spend flowing through Tier 2 CSPs will begin to outgrow spend at their Tier 1 peers as the focus shifts from the large-cap arms race to broader applications (Morgan Stanley used the term “tech diffusion” to describe this). Over what time horizon, I’m unsure, but in the long run I see this as incrementally positive for Nvidia because Tier 2 CSPs (DigitalOcean, IBM Cloud, Oracle) are not equipped with the resources to pursue vertically integrated chip solutions. They do, however, have the resources to afford the industry standard: Nvidia.
Further, Tier 2 CSPs are more likely to specialize in healthcare, financial services, or government, offering compliance and security features tailored to each industry. As the regulatory body of work gets tackled (which will happen over a long time), Tier 2 CSPs, with their focus on specialization and customization, will emerge well-equipped to meet those demands. That shift should drive increased demand for high-performance computing resources, particularly GPUs, as these providers build out capacity for industry-specific AI workloads. And since they lack the resources to develop in-house chips, they will turn to Nvidia’s industry-leading GPUs to power those workloads.
Long-Term Growth Story Still Intact Despite Hyperscaler Chip Concerns
While I sadly don’t have a model built on Nvidia to tell me what future volumes are currently priced into the name, I have to believe the enterprise market is the biggest unknown, given that there’s no great bottom-up analysis (that I’ve seen) sizing the long-tail enterprise world. That’s something I hope a sell-side shop takes on, because investors would find value in even a rough shot on goal at how large the enterprise AI market will be. That way, we could at least start to game out the ROI of these investments instead of leaning on statements like this one from Alphabet CEO Sundar Pichai:
The risk of underinvesting is dramatically greater than the risk of overinvesting.
Much as I’d like to believe that’s true, capex intensity (capex/sales) eventually has to find a ceiling, or investors will get nervous about the hyperscalers.
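The ratio itself is trivial; the trajectory is what matters. A toy example with entirely made-up numbers:

```python
# Capex intensity = capex / sales. All figures below are hypothetical,
# purely to show the trajectory investors are watching.
years = [2023, 2024, 2025]
capex = [30.0, 50.0, 75.0]     # $b, made up
sales = [210.0, 245.0, 280.0]  # $b, made up

for y, c, s in zip(years, capex, sales):
    print(f"{y}: capex intensity = {c / s:.1%}")
# 2023: 14.3%, 2024: 20.4%, 2025: 26.8% -- a ratio rising with no visible
# ceiling is exactly what would make investors nervous.
```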
Nvidia, on the other hand, is simply the dealer. They work with the companies that will lead tomorrow’s markets, and they have ample exposure to the broadening of spend, not to mention investments in dozens of companies across the AI ecosystem. As they say, the house always wins.