GPT-4 – The law of diminishing returns

Many more monkeys. Still no Shakespeare.

  • OpenAI has launched GPT-4 upon the world, but while it records steady improvements in performance, the company refuses to disclose how many parameters or how much compute it took to create, raising the possibility that this game is becoming so expensive to play that it will never be commercially viable.
  • GPT-4 is the latest version of OpenAI’s generative foundation model and differs from GPT-3 in that it is larger (possibly 100tn parameters) and that there was human intervention in its training.
  • 100tn parameters (if correct) would be a staggering increase in size, 571x the 175bn of GPT-3, and frankly, I am surprised that OpenAI was able to find that much data in existence.
  • It will also be vastly more expensive to run in terms of electricity, and more expensive to build, requiring many more processors to train it and far more memory to store it.
  • OpenAI is refusing to disclose anything regarding the architecture, model size, hardware, training compute, dataset construction or training method (see here, section 2, first paragraph).
  • It says that this has been done for competitive reasons (for which there is an argument, given the degree to which Microsoft has panicked Google), but I suspect that the resources consumed to create, train and run GPT-4 were exponentially greater than for GPT-3.
  • GPT-4 is demonstrably better than GPT-3 at taking standardized tests like the GRE, the SAT and the legal bar exam, where its score jumped from the 10th percentile to the 90th, but the system still has a tendency to hallucinate.
  • In fact, GPT-4 has all of the same limitations that are inherent in GPT-3, meaning that, in terms of making GPT more aware and more causal in its understanding, there has been no progress at all.
  • This is fundamental because causal understanding is the central limitation of all systems based on deep learning as these systems reason by statistical correlation, not by causal understanding.
  • This is what leads to the errors, irrationality, craziness and hallucinations that many people have reported and unless something fundamentally changes in how these models are built, these problems will persist.
  • GPT-4 is also unusual in that it had human intervention during its creation, done in an attempt to pre-empt bad actors from enticing the system into saying socially unacceptable things.
  • The problem here is that GPT-4 has now had bias injected into it, as views on what is socially acceptable vary widely, and so the very bias that OpenAI claims to be trying to eradicate is now part of the system by design.
  • GPT-4 has now also been trained with vision and can describe photographs as well as what is odd or strange about them.
  • This represents a sort of generalization, as language and vision are currently implemented using two different types of neural network, but here OpenAI claims to have managed both with just one.
  • Whether GPT-4 can perform as well as other generative AIs that are specifically designed for generating or recognizing images remains to be seen.
  • For example, Midjourney, which was designed for image generation from the ground up, is much better than DALL-E, which is based on the large language model GPT-3.
  • The net result is that I think that GPT-4 is a triumph of effort over finesse and reinforces my view that OpenAI’s philosophical approach to AI remains that with infinite data and infinite compute power, general artificial intelligence will magically appear.
  • I see this as an iteration of the infinite monkey theorem which states that if you have enough monkeys and enough time, they will eventually produce the works of William Shakespeare.
  • Unfortunately, despite a massive increase in monkeys, there is no sign of any of the famous plays.
  • This also raises the likelihood that we are fast reaching the point of diminishing returns in deep learning where more and more effort is required to produce smaller and smaller improvements.
  • The improvement of GPT-4 over GPT-3 is smaller than that of GPT-3 over GPT-2, and the likelihood is that a much greater increase in resources was required to produce it.
  • Furthermore, there is nothing in GPT-4 that leads me to think that general artificial intelligence is any closer, leaving the industry in need of other techniques to solve some of the more difficult problems of AI like autonomous driving.
  • Here, I continue to think that a combination of rules-based software and small specific neural networks each carrying out a small simple task is the way that these problems will be solved in a practical way.
  • There has been some progress on this front (see here) but this approach still remains some way away from a practical and commercial application.
  • In the meantime, OpenAI and others will continue to fuel the hype and expectations around artificial general intelligence until such time as those expectations are not met.
  • This will result in disappointment, disillusionment, falling investment and lower valuations as it has on three separate occasions in the last 70 years.
  • In short, the 4th AI winter.
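
The scale claim in the bullets above is easy to sanity-check: GPT-3's published size is 175bn parameters, so a rumoured 100tn-parameter GPT-4 would indeed be roughly 571x larger. A minimal check (note that the 100tn figure is an unconfirmed rumour, not a disclosed specification):

```python
# Sanity check of the scale arithmetic quoted above.
gpt3_params = 175e9   # GPT-3: 175 billion parameters (published by OpenAI)
gpt4_params = 100e12  # GPT-4: 100 trillion parameters (rumour, undisclosed)

ratio = gpt4_params / gpt3_params
print(f"Rumoured GPT-4 would be ~{ratio:.0f}x the size of GPT-3")  # ~571x
```

At four bytes per parameter, such a model would need roughly 400 terabytes just to store its weights, which underlines the point about memory and hardware cost.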

RICHARD WINDSOR

Richard is the founder and owner of the research company Radio Free Mobile. He has 16 years of experience in sell-side equity research. During his 11-year tenure at Nomura Securities, he focused on equity coverage of the global technology sector.