USA vs China – Distillation of Opinion

The IP theft argument is pretty weak

  • US legislation to sanction Chinese companies that use distillation to train their models is based on fairly weak arguments, and as the model companies are already working together to combat this problem, legislation seems to me to be a waste of time.
  • Distillation is a well-known process in LLM training where answers from larger models are used to train smaller ones that then are able to obtain much better performance relative to the resources that they consume.
  • This is obviously highly desirable in an environment where there is already intense competition and where everyone is trying to make a return for their investors.  
  • Many frontier labs have very large models that are simply used to train the smaller ones that are then made available for commercial purposes.
  • The problem is that with the models being available from anywhere through a web browser or an API, anyone can access them and use them to train their own models.
  • I think that, because the models are publicly available for a fee, the idea that using them to train another model to mimic their performance counts as IP theft is a bit of a stretch.
  • This is not Alibaba and co breaking into Anthropic and stealing its source code but simply using the available tools in a certain way.
  • Furthermore, the companies are perfectly within their rights to ban the practice of using their models for distillation and they have, or will soon have, sophisticated tools that will detect and restrict this activity.
  • Consequently, I do not see a need for legislation, and I think it will fail as it will need almost all of the Republicans to support it in order to pass.
  • Furthermore, the bill’s argument that the Chinese run small lightweight models simply to undercut their Western rivals is demonstrably false.
  • China runs small models because it has no choice, and it lives in the open source because it makes exporting its models easy and because open source is part of its strategy to dominate the AI industry.
  • Jensen Huang recently stated that, thanks to its abundant energy, China is not constrained by a lack of computing power, but the reality is that if you ask any AI lab in China, it will tell you that availability of compute is its biggest problem.
  • This is because they cannot import advanced AI chips, mainly due to Chinese state policy, and the chips that they can get access to are not good enough.
  • RFM Research has concluded that the Chinese-made AI chips are so far behind everyone else that they are completely uneconomical to deploy at scale.
  • This is why China’s main way to access economical and capable compute is to import it by renting data centres in countries like Thailand, Malaysia and so on.
  • This is an effective workaround, and I suspect that to offer AI services at scale in China, almost all of the compute will need to be imported from overseas.
  • This is why I expect that when the US Department of Commerce updates its rules in Q3 or Q4 of this year, it is the practice of selling advanced compute to China from neighbouring countries that will be targeted.
  • In my mind, this is where US government action might make a difference, as the practice of distillation of US models by Chinese companies is something that is more easily controlled by the private sector.
  • If successful, this would be a hammer blow to China’s AI ambitions and would have the impact of closing a major loophole in the strategy of containing China’s rise as a technological and geopolitical power.
  • Hence, this is the issue to keep an eye on, while legislation on distillation is fairly meaningless and unlikely to make it into law.

RICHARD WINDSOR

Richard is the founder and owner of the research company Radio Free Mobile. He has 16 years of experience working in sell-side equity research. During his 11-year tenure at Nomura Securities, he focused on the equity coverage of the Global Technology sector.
