OpenAI & Broadcom – Transistor to Token

A bold bet with yet another frame order.  

  • OpenAI has signed another colossal deal, but this time it is for its own in-house inference system, which it is betting can beat both Nvidia and AMD. That is a bold bet to make when you consider that OpenAI has no experience whatsoever in designing silicon chips.
  • The new deal is for a further 10GW (~$100bn) of AI compute that will be built in conjunction with Broadcom, with roll-out beginning in H2 2026 and essentially complete by the end of 2029.
  • This brings the total capacity that OpenAI has signed deals for to 30GW, which is 15x the 2GW that OpenAI has running ChatGPT today.
  • This triggers an immediate pause for thought, as OpenAI is able to serve roughly 10% of the world’s population at a reasonable service level with just 2GW.
  • Given that the next generations of silicon will be more efficient, one immediately begins to wonder what the other 28GW of compute will be used for (see the back-of-the-envelope sketch after this list).
  • OpenAI’s hope is that if it builds it, the demand will come, and there are some reasonable arguments for this, but I suspect that, like the Internet, it is going to take more time than expected for the use cases to materialise and to offer a decent return on investment.
  • This is why all of these deals are structured as frame orders, where the total and the price are fixed but the order is filled by the client as and when it needs the product.
  • The main terms of the deal are:
    • First, Full custom: where OpenAI designs the custom accelerator and the CPU that controls it, and Broadcom designs the communications and networking part of the system.
    • This is a fully integrated system where everything from transistor to token is designed by OpenAI.
    • The hope here is that this vertical integration will enable it to offer better performance in terms of tokens/watt and tokens/$ than anything that AMD and Nvidia can come up with (both metrics are sketched after this list).
    • This is a pretty bold assumption (see below).
    • Second, Inference: where the fully custom system will be used for inference of models that have presumably been trained on the 10GW of capacity that OpenAI has signed up to buy from Nvidia with its equity (see here).
    • If the use of AI is about to explode as OpenAI says it is, then efficiency is going to be the name of the game, and this is where OpenAI’s custom solution needs to really excel.
    • Third, Ethernet: which is the open standard that will be used for communication, as opposed to Nvidia’s proprietary NVLink system.
    • This is Broadcom’s main contribution to the collaboration, but I suspect that Broadcom will also organise the manufacturing of the OpenAI chip and its assembly into a system like the ones that Nvidia sells.
    • Fourth, 3 years: which is a pretty tight time frame.
    • Roll-out will begin in H2 2026, and OpenAI thinks that the entire 10GW will be deployed by the end of 2029.
    • This is a very aggressive time frame, and it is here where I begin to wonder whether this level of demand will occur before this deadline.
    • This is why these deals are all frame orders, meaning that if the time frame slips, it won’t matter very much, although the listed companies will have to deal with underperforming public-market expectations.
  • The net result is that OpenAI and Broadcom are making the bold bet that they can out-design and out-execute Nvidia and AMD when it comes to maximising token output per dollar invested and watt of power consumed.
  • However, Nvidia is currently the king of the hill when it comes to these metrics.
  • This is not because its silicon is massively better than anyone else’s, but because it is one generation ahead of everyone else, has a fully customised communication system, and has software tools (Dynamo) that can optimise an entire data centre.
  • Furthermore, Nvidia has been designing AI systems for well over 10 years, and the idea that OpenAI will arrive with no experience and beat it at its own game is a pretty wild assertion.
  • However, Nvidia’s and AMD’s systems are not optimised for one type of model, and it is here where I suspect that OpenAI is hoping it can get one over on Nvidia and AMD.
  • I think that this is a pretty tall order, and so we will see when the system is launched just how well it measures up on tokens/watt and tokens/$.
  • My suspicion is that it is here where expectations are overhyped, as we are clearly not on the path to superintelligent machines, meaning that demand will be a function of usefulness and return on investment rather than the compute capacity required to retire the human race.
  • Hence, I can see the roll-out being much slower than expected as everyone cuts back and decides to roll out to meet demand rather than expecting that demand will materialise because the capacity is there.
  • This correction will be hardest for those without cash flow, and so I expect that most of the LLM companies will end up being acquired.
  • OpenAI’s independence will depend on its ability to generate cash, which at the moment is in a poor state: while it has billions in revenue, it is still burning billions every quarter.
  • In the long term, I suspect the capacity will be built, but by whom and who makes the real money remains to be seen, although Nvidia remains in very good shape.  
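
To put the capacity arithmetic above on paper, here is a minimal back-of-the-envelope sketch. The 2GW, 30GW and ~10%-of-the-world figures come from the piece; the world population number and the assumption that serviceable users scale linearly with deployed power are mine, purely for illustration.

```python
# Back-of-the-envelope capacity arithmetic using the figures in the piece.
# Assumption (mine, for illustration only): serviceable users scale roughly
# linearly with deployed power at today's efficiency.

WORLD_POPULATION = 8.1e9           # approximate world population
current_capacity_gw = 2.0          # runs ChatGPT today (per the piece)
contracted_capacity_gw = 30.0      # total capacity OpenAI has signed for
current_share_served = 0.10        # ~10% of the world's population today

users_today = WORLD_POPULATION * current_share_served
users_per_gw = users_today / current_capacity_gw

# Naive linear extrapolation: what could 30GW serve at today's efficiency?
implied_users = users_per_gw * contracted_capacity_gw

print(f"Capacity multiple:     {contracted_capacity_gw / current_capacity_gw:.0f}x")
print(f"Users served per GW:   {users_per_gw / 1e6:.0f}m")
print(f"Implied users at 30GW: {implied_users / 1e9:.1f}bn "
      f"({implied_users / WORLD_POPULATION:.0%} of the world)")
```

Even on these naive assumptions, 30GW at today’s efficiency would serve far more people than exist, which is precisely why the question of what the remaining 28GW is for needs a good answer.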
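
Similarly, the two metrics on which the custom system will be judged, tokens/watt and tokens/$, are simple ratios. The sketch below shows how they are computed; every throughput, power, cost and electricity-price number is a hypothetical placeholder of mine, not a figure for any real Nvidia, AMD or OpenAI system.

```python
# Sketch of the two metrics that matter for inference hardware:
# tokens/watt and tokens/$. All numbers are hypothetical placeholders.

def tokens_per_watt(tokens_per_second: float, power_watts: float) -> float:
    """Sustained inference throughput divided by power draw."""
    return tokens_per_second / power_watts

def tokens_per_dollar(tokens_per_second: float, power_watts: float,
                      system_cost_usd: float, lifetime_years: float = 4.0,
                      usd_per_kwh: float = 0.08) -> float:
    """Lifetime token output divided by capex plus lifetime energy cost."""
    seconds = lifetime_years * 365 * 24 * 3600
    lifetime_tokens = tokens_per_second * seconds
    energy_cost_usd = (power_watts / 1000) * (seconds / 3600) * usd_per_kwh
    return lifetime_tokens / (system_cost_usd + energy_cost_usd)

# Two hypothetical rack-scale systems: a general-purpose incumbent and a
# custom design that trades generality for efficiency on one model family.
systems = {
    "incumbent": dict(tokens_per_second=1.0e6, power_watts=120_000,
                      system_cost_usd=3.0e6),
    "custom":    dict(tokens_per_second=1.2e6, power_watts=100_000,
                      system_cost_usd=2.5e6),
}

for name, s in systems.items():
    tpw = tokens_per_watt(s["tokens_per_second"], s["power_watts"])
    tpd = tokens_per_dollar(**s)
    print(f"{name:>9}: {tpw:.1f} tok/s per W, {tpd:,.0f} tokens per $")
```

For the bet to pay off, the custom system has to win on both ratios at once, and specifically on OpenAI’s own models, which is where the vertical integration argument lives or dies.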

RICHARD WINDSOR

Richard is the founder and owner of the research company Radio Free Mobile. He has 16 years of experience working in sell-side equity research. During his 11-year tenure at Nomura Securities, he focused on equity coverage of the Global Technology sector.