Artificial Intelligence – The wall.

Now the real work begins.

  • OpenAI has admitted that the limits of massive compute and massive data may already have been reached, meaning that, to make these systems useful, ways need to be found to implement them cheaply and to make them less crazy.
  • At an event last week at MIT, Sam Altman, CEO of OpenAI, admitted that making the models bigger would no longer make them better, leading many critics to point out that we are already at the point of diminishing returns.
  • This was already evident from OpenAI’s “research paper” on GPT-4, where a probable order-of-magnitude increase in the size of the model produced only linear improvements in performance.
  • This is typical of a technology whose maximum potential is close to being reached, something I have been expecting in deep learning for some time.
  • With everyone rushing headlong into this technology, pretty soon there are going to be a large number of ChatGPT lookalikes, all of which consume vast resources and all of which are equally crazy.
  • This means that the value in AI research will quickly move from making the models bigger to making them cheaper and easier to implement, as well as to reducing hallucinations as much as possible.
  • This is what will be required to make the two commercial use cases (beyond entertainment) that I have identified become a reality.
  • These are the cataloguing of data within an enterprise and the man-machine interface in the vehicle (see here).
  • Both of these make use of the ability of large language models (LLMs) to accurately understand the request being made, as well as its context and circumstances, meaning that the model has a very good idea of what it is being asked to do.
  • This has been a limitation of chatbots to date and the main reason why they are only used for telling the time, turning on lights and playing music.
  • LLMs are also very good at ingesting large amounts of information, which can then be retrieved easily without having to put the data into a database or label it (a toy sketch of this kind of retrieval follows the list below).
  • The problem is that when they are asked something that they do not specifically know the answer to, they convincingly make stuff up, meaning that the user needs to double-check everything that they produce.
  • They are also very expensive to deploy: GPT-3, with 175bn parameters, needs around 800GB of memory to store and substantial compute resources to execute requests in a timely manner (see the back-of-envelope arithmetic after this list).
  • It is here that I think the valuable innovations are going to be made, because the more hallucination can be contained (or at least identified) and the cost of running these models reduced, the more practical they become for real-world use cases.
  • Real-world use cases lead to revenue, which, at the end of the day, is why almost everyone is in business.
  • In the enterprise use case, shared models are already proving problematic because of the potential for data leakage, and in the vehicle, a cloud-based voice service is not reliable enough.
  • Hence, both of these use cases are going to require an instance of the LLM to be deployed either on-site (i.e. at the edge or on the device) or in a private cloud.
  • This means that ways need to be found to cost-effectively implement LLMs at the edge of the network and, in many instances, on the device itself.
  • This, combined with addressing (or at least containing) the hallucination problem, is how LLMs move from the realm of wild speculation into revenue-generating reality.
  • There is still a lot of work to be done because, while many people can imagine the use cases, hardly anyone knows how to deploy the technology in practice.
  • Nvidia remains the go-to place for those wanting to invest in this craze, but I am starting to look for companies that can reduce the limitations of these systems and deploy them efficiently in edge devices.
  • While Nvidia seems to have the training space locked up, there is a great opportunity in inference which, at scale, could be a much larger market than training.
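
On the retrieval point above, here is a minimal sketch of how text can be ingested and retrieved as vectors without a database schema or labels. The embed function is a hashed bag-of-words stand-in so that the example runs with no dependencies (a real system would use a learned embedding model), and the document snippets are invented for illustration.

    # Toy sketch: retrieve unlabelled documents by vector similarity.
    import math
    from collections import Counter

    def embed(text, dim=256):
        # Stand-in embedding: hashed word counts, normalised to unit length.
        vec = [0.0] * dim
        for word, count in Counter(text.lower().split()).items():
            vec[hash(word) % dim] += count
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        return [v / norm for v in vec]

    def cosine(a, b):
        # Dot product of unit vectors = cosine similarity.
        return sum(x * y for x, y in zip(a, b))

    # Ingestion: no schema, no labels, just one vector per document.
    docs = [
        "Q3 sales figures for the European region",
        "Employee handbook: travel and expenses policy",
        "Vehicle voice interface design specification",
    ]
    index = [(doc, embed(doc)) for doc in docs]

    # Retrieval: nearest neighbour to the query vector.
    query = "what is our expenses policy"
    best = max(index, key=lambda pair: cosine(embed(query), pair[1]))
    print(best[0])  # should print the employee handbook entry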
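The cost point is simple arithmetic on parameter count multiplied by bytes per parameter, which is also why quantisation (storing weights at lower precision) is one obvious route to the edge. The numbers below are rough estimates of weight storage alone, not vendor-published figures.

    # Back-of-envelope memory footprint for a 175bn-parameter model.
    # Weights only; serving also needs memory for activations and caches,
    # so real deployments need more (hence figures like the 800GB above).
    PARAMS = 175e9

    BYTES_PER_PARAM = {
        "fp32": 4.0,  # full precision
        "fp16": 2.0,  # half precision, common for inference
        "int8": 1.0,  # 8-bit quantisation
        "int4": 0.5,  # 4-bit quantisation
    }

    for precision, nbytes in BYTES_PER_PARAM.items():
        print(f"{precision}: ~{PARAMS * nbytes / 1e9:,.0f} GB of weights")

    # fp32: ~700 GB, fp16: ~350 GB, int8: ~175 GB, int4: ~88 GB.
    # Even at 4 bits, a 175bn-parameter model is far too large for a device,
    # which is why cheaper, smaller implementations are where the value lies.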

RICHARD WINDSOR

Richard is founder and owner of the research company Radio Free Mobile. He has 16 years of experience in sell-side equity research. During his 11-year tenure at Nomura Securities, he focused on equity coverage of the Global Technology sector.