Google TurboQuant – Demand Driver

TurboQuant could easily increase demand for memory.

  • TurboQuant can halve the memory footprint of AI algorithms, but instead of driving memory demand down, cheaper compute will most likely stimulate further demand, meaning that memory requirements stay the same or even grow.
  • Google issued a product press release (dressed up as a scientific paper) detailing a new compression algorithm that it claims can reduce the memory overhead required for compute, which caused memory stocks to correct even more than the already skittish market last week.
  • TurboQuant (see here) is a set of algorithms whose two main components are PolarQuant, which performs the compression, and QJL, which corrects the errors introduced by the compression.
  • The result is a significant reduction in the space required to store and run AI algorithms, meaning that the same algorithm can now run in half, or less than half, of the memory it needed before.
  • This is a modification of a technique known as quantisation, which has been around for many years and involves expressing the same data in a smaller number of bits.
  • This technique is already well known, especially in digital imaging, which offers a great example of the trade-off between file size and image quality.
  • TurboQuant appears to be able to take quantisation down to 2-bit, which is one stage further than the current standard of 4-bit.
  • The problem with 2-bit to date has been the loss of accuracy that results from too much compression, and as far as I can tell from the press release, fixing this is the key proposition offered by TurboQuant (the sketch after this list makes the trade-off concrete).
  • Using 2-bit rather than 4-bit would double the number of tokens that a data centre can produce without increasing its costs, which, assuming that there is no decline in token price, would greatly help the economics of the data centre.
  • However, history indicates that this may be wishful thinking, as the move from 32-bit to 4-bit did not help economics very much despite large increases in token production.
  • Either way, if 2-bit can become usable, then the memory requirement to run AI will drop by half, which is why the memory stocks fell by more than the market in the latter half of last week.
  • Micron, for example, has fallen by nearly 20% in the last 5 sessions, with SK Hynix down 14% and Samsung down 10%.
  • I think that these moves are overdone, as there is no sign from the market that the supply constraint on memory is easing, and I am not aware of anyone cancelling their order for high-bandwidth memory.
  • Furthermore, I suspect that if 2-bit compute becomes widely used, then demand for compute will increase further, and the amount of memory required will either stay the same or increase yet further.
  • Quantisation has yet to trigger a decline in demand for memory, and there is no indication that anything will change if 2-bit becomes the industry standard.
  • Hence, I think that this is a storm in a teacup which has been greatly exacerbated by world events, meaning that the memory companies have just gone on a 20% sale.
  • With the memory companies very likely to meet or exceed their earnings estimates this year, they are trading on less than 10x PER, meaning that this is a good entry point for anyone with no exposure.
  • The days of the semiconductor cycle with memory at its epicentre are far from over, but I think we are still at least a year away from the top of the cycle, meaning that there is further upside to come.  
  • I already hold Samsung Electronics and remain very comfortable hanging onto it as I think it still has further to run.
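
To make the accuracy trade-off concrete, below is a minimal sketch of plain uniform quantisation in Python. This is my own illustration of the general technique, not Google's method: TurboQuant's PolarQuant/QJL pipeline is considerably more sophisticated, and the quantise helper, variable names, and error metric here are assumptions for demonstration only. The sketch rounds a tensor of stand-in "weights" onto 4-bit and 2-bit grids and prints the memory fraction alongside the reconstruction error, i.e. the loss of accuracy that has kept 2-bit from becoming usable to date.

```python
# A minimal sketch of uniform quantisation (illustrative only -- not
# TurboQuant's PolarQuant/QJL pipeline, which is far more sophisticated).
import numpy as np

def quantise(x: np.ndarray, bits: int) -> np.ndarray:
    """Round x onto 2**bits evenly spaced levels, then map back to floats."""
    levels = 2 ** bits - 1                       # 15 steps at 4-bit, 3 steps at 2-bit
    lo, hi = float(x.min()), float(x.max())
    scale = (hi - lo) / levels                   # width of one quantisation step
    codes = np.round((x - lo) / scale)           # integer codes in [0, levels]
    return (codes * scale + lo).astype(x.dtype)  # dequantised approximation

rng = np.random.default_rng(0)
weights = rng.standard_normal(100_000).astype(np.float32)  # stand-in for model data

for bits in (32, 4, 2):
    approx = weights if bits == 32 else quantise(weights, bits)
    err = float(np.abs(weights - approx).mean())
    print(f"{bits:>2}-bit: {bits / 32:5.1%} of full-precision memory, "
          f"mean abs error {err:.4f}")
```

On a typical run the mean error grows several-fold between 4-bit and 2-bit on the same data, which is precisely the gap that a correction scheme such as QJL would need to close for 2-bit to become the standard.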

RICHARD WINDSOR

Richard is the founder and owner of the research company Radio Free Mobile. He has 16 years of experience in sell-side equity research. During his 11-year tenure at Nomura Securities, he focused on equity coverage of the global technology sector.
