AI Silicon – The Inference Game

The time for niche offerings is now.

  • The market for inference is rapidly opening up, and for the newcomers, the timing is perfect, but Nvidia already has a number of tricks up its sleeve to ensure that it does not cede too much share to the upstarts.
  • I have long held that the market for AI silicon will be dominated by inference rather than training, and as the market grows, this is starting to happen, with inference also becoming more specialised.
  • Inference breaks down into distinct stages: understanding the request, “thinking” about the answer and then constructing the answer.
  • For each of these, the processing requirement and the balance between processor, memory and networking are slightly different, meaning that running each stage on different silicon should give more output (revenue) per $ invested or watt consumed.
  • This is what the industry refers to as disaggregation, and it opens the market up for niche players. Because the market is now so large, it can be economically viable to focus on just one of these niches.
  • This is what Nvidia’s acquisition of Groq was all about, as Groq is very good at producing tokens very quickly, which is a segment of the market that commands a much higher price.
  • This looks like just the beginning of this trend, and the bigger the inference market becomes, the more space there will be for niche players to win a piece of the market.
  • Innovation in this space is moving very quickly, with players like Etched, which has just raised $800m and offers a full server rack focused purely on transformers, and Qualcomm, which has a new memory architecture that fixes the memory bottleneck without using expensive high-bandwidth memory (HBM).
  • Furthermore, the ubiquitous CUDA platform, which ties AI companies into Nvidia silicon, is much less sticky in inference, which also makes it much easier for newcomers to break in.
  • This is where Modular comes in: it is a software platform that supports all silicon vendors and, if the company is to be believed, can run inference on Nvidia silicon as well as CUDA can.
  • Qualcomm announced that it would acquire the company last week, immediately raising questions about its independence and whether other chip vendors will now stop using it.
  • Qualcomm is making a big effort to keep it independent, and given the success of Anthropic’s Model Context Protocol (MCP) for agent communications, which has been widely adopted despite coming from a single vendor, there is a good chance that Modular will continue to be used even when owned by Qualcomm.
  • Against this incoming competition, Nvidia still has plenty of strategies to ensure that it remains dominant, including its ability to produce next-generation silicon before anyone else and its move up through the technology stack.
  • Being first means that it can rightly claim better token economics even at 75% gross margins, and its move into producing models and reference designs is all about keeping clients coming back for more chips.
  • That being said, the market is becoming so large and is growing so quickly that there is plenty of demand for everyone, and so while I suspect that Nvidia will lose share in inference, the company will still grow fairly quickly.
  • This state of affairs looks set to continue at least until the end of 2027, and so the time to launch a new offering for inference in the data centre is right now.
  • I suspect that some of the smaller niche players like Etched will be acquired by large players as they look to round out their portfolios to address the market as it becomes more specialised.
  • As the market continues to grow and Nvidia’s share price stagnates, the shares are starting to look quite reasonably valued again, with a 2027 PER of 15.8x while still offering 20% YoY growth in EPS.
  • King of the valuation opportunities remains Qualcomm, which upgraded its 2029 non-handset revenue target by 82% as its data centre product already has traction with 4 hyperscalers, but the shares have fallen 10% since the announcement.
  • This is where I already have a position, and I am looking to add a little more to bring my holding up to full weight, as I can get to a valuation of $360 per share without trying very hard.
  • I also continue to hold Samsung Electronics, to benefit from the memory madness, and nuclear power, as a major way in which data centres will meet their energy needs.
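
The Nvidia valuation point above (a 2027 PER of 15.8x against 20% YoY EPS growth) can be sanity-checked with simple arithmetic. The sketch below is only illustrative: the PEG ratio and the earnings yield are yardsticks assumed here and are not named in the text, and the function names are hypothetical. Only the two input figures come from the article.

```python
# Illustrative only: PEG and earnings yield are assumed yardsticks,
# not metrics used in the article. Inputs are the article's figures.

def peg_ratio(forward_per: float, eps_growth_pct: float) -> float:
    """Forward PER divided by EPS growth expressed in percent."""
    if eps_growth_pct <= 0:
        raise ValueError("PEG is only meaningful for positive EPS growth")
    return forward_per / eps_growth_pct


def earnings_yield_pct(forward_per: float) -> float:
    """Inverse of the PER, expressed in percent."""
    if forward_per <= 0:
        raise ValueError("PER must be positive")
    return 100.0 / forward_per


# Figures quoted in the text: 2027 PER of 15.8x, 20% YoY EPS growth.
nvidia_peg = peg_ratio(15.8, 20.0)
nvidia_yield = earnings_yield_pct(15.8)

print(f"PEG: {nvidia_peg:.2f}")                # PEG: 0.79
print(f"Earnings yield: {nvidia_yield:.1f}%")  # Earnings yield: 6.3%
```

A PEG below 1 is a common rule of thumb for growth that is not fully priced in, which is consistent with the shares looking "quite reasonable", although a single ratio is a rough guide at best.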

RICHARD WINDSOR

Richard is the founder and owner of the research company Radio Free Mobile. He has 16 years of experience working in sell-side equity research. During his 11-year tenure at Nomura Securities, he focused on equity coverage of the global technology sector.
