Artificial Intelligence – Whoops Microsoft

Microsoft accidentally makes the case for edge AI.

  • A simple mistake by Microsoft employees exposed terabytes of the company’s internal data to the world, underlining that generative AI is still at the Wild West stage and that inference, and even training, at the edge is a safer prospect in the long term.
  • Microsoft’s AI researchers posted a repository of training data on GitHub for everyone to use but accidentally exposed 38 terabytes of Microsoft’s internal information, with the access permissions set such that anyone with the URL could read and edit it (a sketch of how easily such a link can be minted follows this list).
  • The data and its permissions were set in 2020, meaning that anyone with the link could have accessed the data at any point over the last three years.
  • The data included Microsoft PC backups and internal messages, but there was no customer data or anything particularly sensitive.
  • This is clearly a storm in a teacup, but it does highlight some of the problems of running large language models (LLMs) in the cloud.
  • Generative AI and the use of LLMs are at a very early stage where everyone is racing to get models into the market to claim virgin territory, and no one seems to have much time to worry about the data they are using or what happens to it.
  • This is the AI equivalent of Meta Platforms’ philosophy of “move fast and break things” five or six years ago.
  • ChatGPT has already been accused of copyright violation for using protected content in training and recycling it in its output, and it has also been seen to leak confidential data that other users entered to prime the model.
  • At the moment, the best-performing models are extremely large (hundreds of billions of parameters) and are trained with massive amounts of data, making any form of training or inference on personal computers completely impractical (see the memory arithmetic after this list).
  • These models are popular because they can create the illusion of generalisation, but I think the reality is that they contain so much information that they can find something relevant on almost any subject they are asked about.
  • However, research is finding new tricks and techniques that reduce the compute and storage overhead of model training and allow enthusiasts and tinkerers to fine-tune Meta’s LLaMA models on powerful personal computers (a sketch of one such technique follows this list).
  • Furthermore, research is indicating that smaller models trained for longer can give as good, if not better, results, further increasing the possibility of inference, and potentially training, at the edge (see the scaling arithmetic below).
  • This is how the open-source community has been able to flood Hugging Face with hundreds of smaller but quite capable models that can perform certain tasks.
  • We are also beginning to see smartphone chipmakers creating chips that can run inference for these models on smartphones even when no network connection is available (a local-inference sketch follows this list).
  • RFM research (see here) has concluded that for models that run at scale, like ChatGPT, the cost of inference is much greater than the cost of training, and that being able to run inference on the user’s device rather than in the cloud greatly reduces the cost for the service provider (a stylised cost illustration follows this list).
  • This is the equivalent of companies moving away from providing employees with devices and instead allowing them to bring their own, onto which the company then installs its software.
  • Running inference at the edge also confers the advantage of privacy and security, as none of the priming data or the detail of the request ever leaves the device.
  • The technology is still very far from being able to run useful and effective generative AI on edge devices, but crucially, both the economics and the research indicate that this is the avenue to pursue in the long term.
  • This, combined with the entanglement and hallucination problems caused by large model sizes, is why I think that, in time, the models will get smaller and more specific in their tasks and will predominantly run inference at the edge.
  • Currently, almost all of the effort is going into creating silicon that can train and run bigger and bigger models in the cloud more quickly and more efficiently, but this is not where I see the real opportunity.
  • Instead, the place to look is the companies that are trying to find ways of running AI on edge devices, as opposed to the cloud-training chip designers, which are already very richly valued.
  • It is in this currently overlooked segment where the real money could be made.
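
Reports on the incident attributed the exposure to an overly permissive Azure shared access signature (SAS) token published alongside the repository. As a minimal sketch of how easily such a token is minted, the Python example below uses the azure-storage-blob package; the account name and key are hypothetical placeholders, and the point is simply that broad permissions plus a distant expiry turn a shareable URL into an open door.

```python
from datetime import datetime, timedelta

from azure.storage.blob import (
    generate_account_sas,
    ResourceTypes,
    AccountSasPermissions,
)

# Hypothetical account credentials, for illustration only.
ACCOUNT_NAME = "examplestorageacct"
ACCOUNT_KEY = "<account-key>"

# An overly broad token: read, write and list over every container
# and object in the account, valid for years rather than hours.
sas_token = generate_account_sas(
    account_name=ACCOUNT_NAME,
    account_key=ACCOUNT_KEY,
    resource_types=ResourceTypes(service=True, container=True, object=True),
    permission=AccountSasPermissions(read=True, write=True, list=True),
    expiry=datetime.utcnow() + timedelta(days=365 * 3),
)

# Anyone holding this URL can now read and modify the data, which is
# essentially the situation described in the bullets above.
url = f"https://{ACCOUNT_NAME}.blob.core.windows.net/?{sas_token}"
print(url)
```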
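
To put the impracticality of local execution in rough numbers, the back-of-the-envelope sketch below estimates the memory needed just to hold a model’s weights at common numeric precisions; the parameter counts are illustrative round numbers, not any specific vendor’s figures.

```python
# Rough memory footprint of model weights alone (ignoring activations,
# optimiser state and KV cache) at common numeric precisions.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1, "int4": 0.5}

def weight_memory_gb(n_params: float, precision: str) -> float:
    return n_params * BYTES_PER_PARAM[precision] / 1e9

for n_params in (7e9, 70e9, 175e9):  # illustrative model sizes
    row = ", ".join(
        f"{p}: {weight_memory_gb(n_params, p):,.1f} GB"
        for p in BYTES_PER_PARAM
    )
    print(f"{n_params / 1e9:.0f}B params -> {row}")

# A 175B-parameter model needs ~350 GB just for fp16 weights, far
# beyond any personal computer, while a 7B model quantised to int4
# fits in ~3.5 GB and can run on a laptop or phone.
```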
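
The tricks and techniques referred to above are largely parameter-efficient fine-tuning methods, of which LoRA is the best known. The sketch below, using Hugging Face’s transformers and peft libraries, shows the general shape: freeze the base model and train only small low-rank adapter matrices. The model name and hyperparameters are illustrative assumptions, not a recommendation.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

# Illustrative base model; any causal LM from the Hub would do.
BASE_MODEL = "meta-llama/Llama-2-7b-hf"

model = AutoModelForCausalLM.from_pretrained(BASE_MODEL)
tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)

# LoRA: instead of updating all ~7B weights, inject small low-rank
# matrices into the attention projections and train only those.
lora_config = LoraConfig(
    r=8,                      # rank of the adapter matrices
    lora_alpha=16,            # scaling factor
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)

# Typically well under 1% of parameters remain trainable, which is
# what brings fine-tuning within reach of a powerful PC.
model.print_trainable_parameters()
```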
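
The smaller-models-trained-for-longer finding is usually traced to DeepMind’s Chinchilla work. The sketch below applies two commonly cited rules of thumb, roughly 20 training tokens per parameter and total training compute of about 6 x N x D FLOPs, to show how the budgets compare; the model sizes are illustrative.

```python
# Chinchilla-style rules of thumb: compute-optimal training uses
# roughly 20 tokens per parameter, and total training compute is
# approximately C = 6 * N * D FLOPs (N params, D tokens).
def optimal_tokens(n_params: float) -> float:
    return 20 * n_params

def training_flops(n_params: float, n_tokens: float) -> float:
    return 6 * n_params * n_tokens

for n_params in (7e9, 70e9):  # illustrative model sizes
    d = optimal_tokens(n_params)
    c = training_flops(n_params, d)
    print(f"{n_params / 1e9:.0f}B params: ~{d / 1e9:.0f}B tokens, ~{c:.2e} FLOPs")

# A 70B model trained compute-optimally costs ~100x the FLOPs of a
# 7B one; conversely, spending extra tokens on a small model yields
# a model that punches above its weight at inference time.
```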
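
On-device inference of the kind the chipmakers are targeting can already be approximated on consumer hardware. The sketch below uses the llama-cpp-python bindings to run a quantised model entirely locally, with no network connection; the model file path is a hypothetical placeholder.

```python
from llama_cpp import Llama

# A 4-bit quantised 7B model in GGUF format is a few GB on disk and
# runs on a laptop or a recent smartphone-class SoC.
llm = Llama(
    model_path="./models/llama-2-7b-chat.Q4_K_M.gguf",  # hypothetical path
    n_ctx=2048,     # context window
    n_threads=4,    # CPU threads; no GPU or network required
)

# The prompt never leaves the device, which is the privacy argument
# for edge inference made above.
output = llm(
    "Summarise the case for running AI models on-device.",
    max_tokens=128,
)
print(output["choices"][0]["text"])
```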
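
Finally, a stylised illustration of the inference-versus-training economics. Every figure below is a hypothetical assumption chosen only to show the structure of the argument: training is a one-off cost, while cloud inference recurs with every query, so each query moved onto the user’s device comes straight off the provider’s bill.

```python
# All figures are hypothetical assumptions, for illustration only.
TRAINING_COST = 5e6        # one-off cost to train the model, $
COST_PER_QUERY = 0.005     # cloud inference cost per query, $
QUERIES_PER_DAY = 100e6    # assumed query volume at scale

daily_inference = COST_PER_QUERY * QUERIES_PER_DAY
annual_inference = daily_inference * 365

print(f"One-off training cost:        ${TRAINING_COST:,.0f}")
print(f"Annual cloud inference cost:  ${annual_inference:,.0f}")
print(f"Inference vs training ratio:  {annual_inference / TRAINING_COST:.1f}x")

# Under these assumptions, a single year of cloud inference costs
# tens of times more than training; pushing inference to the edge
# removes the dominant recurring cost.
```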

RICHARD WINDSOR

Richard is the founder and owner of the research company Radio Free Mobile. He has 16 years of experience working in sell-side equity research. During his 11-year tenure at Nomura Securities, he focused on equity coverage of the global technology sector.