Artificial Intelligence – Generic origami pt. III

AlphaFold almost fulfils its promise.

  • DeepMind has expanded a crucial biological database 200-fold, confirming that an entire branch of scientific enquiry is now virtually obsolete.
  • AlphaFold is DeepMind’s algorithm for predicting the physical structure of a protein purely from its amino acid sequence. It was interesting in 2018 (see here), impressive in 2020 (see here) and is now in practical use with great potential.
  • Proteins are the principal product of DNA and, as such, are responsible for the vast majority of the functions and features that result from an organism’s genetic make-up.
  • In software terms, DNA is the lines of code while the protein is the app itself, which underlines how crucial it is to understand protein structure.
  • Proteins are chains of amino acids which, after they have been synthesized from the instructions in DNA, are folded by the electrostatic interactions between the amino acids into the shape that enables their function.
  • Protein structure is crucial because it is, in effect, the execution of instructions that are encoded on the DNA strand.
  • However, because there are 20 different amino acids that can be used and because proteins can be many hundreds of amino acids long, the number of possible structures that can be formed from one chain is practically infinite.
  • Hence, calculating each possibility by brute force, as a regular computer would, is impossible: Cyrus Levinthal calculated in 1969 that a protein with 100 amino acids has 3^198 possible conformations (see here).
  • The traditional way to work out a protein’s structure is to make a very pure crystal of the protein and bombard it with x-rays, which are deflected when they hit atoms, creating a pattern from which the structure can be deduced.
  • Understanding a protein’s structure is crucial to understanding its function and as such plays a role in diagnosis, treatment, drug discovery, food production, vaccine development and so on.
  • The most famous use of x-ray diffraction was the discovery of the double-helix structure of DNA in the 1950s, which was predicted by Watson and Crick and proved with x-ray data by Rosalind Franklin and Maurice Wilkins.
  • X-ray crystallography is very expensive and very slow, which is why only around 190,000 protein structures have been solved this way to date in over 70 years of effort.
  • Predicted protein structures are stored in a public database, the AlphaFold Protein Structure Database, which covers the proteins catalogued in UniProt (see here) and which will now be expanded from 1m structures to 200m structures using AlphaFold’s predictions.
  • This means that most proteins that researchers look into will now have a predicted structure that is likely to be extremely accurate from the minute they start working with them.
  • It also raises the possibility of searching the database for certain types of structure or functionality, which could reduce discovery time by decades.
  • AlphaFold is still far from perfect, and proteins are still going to have their structures checked in the traditional way, but this will be much easier and faster when one knows with 90%+ accuracy what one is looking for.
  • I suspect that the requirement to check the structure using x-ray crystallography will rapidly disappear as AlphaFold’s accuracy improves with time and it is proven against the tried and tested method.
  • This is a great example of just how useful AI can be but also a reminder of its limitations.
  • Protein folding is an extremely complex system of electrostatic interactions but, nonetheless, it is a system that is finite and where the rules are well defined and do not change.
  • This makes protein folding an ideal candidate for deep learning, but it also clearly indicates that tasks like driving vehicles and human conversation remain far out of reach.
  • This is because these tasks are virtually infinite in their scope and the rules of their systems are constantly changing.
  • Hence, no matter how much DeepMind likes to tout its wares, this brilliant breakthrough has no bearing on solving the really difficult problems of AI where progress remains extremely slow.
  • Skynet is not coming for us yet.
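
To give a sense of the scale behind Levinthal's figure quoted above, here is a minimal Python sketch of the arithmetic. It assumes the usual simplification behind that number: a 100-amino-acid chain has 99 peptide bonds, each with two rotatable backbone angles, and each angle can settle into one of three stable states. The function names and the rate of one trillion conformations checked per second are hypothetical and chosen purely for illustration; they are not taken from DeepMind or from Levinthal.

```python
def levinthal_conformations(residues: int, states_per_angle: int = 3) -> int:
    """Count conformations under the simplified Levinthal-style assumption.

    Assumes residues - 1 peptide bonds, two rotatable angles per bond,
    and `states_per_angle` stable states per angle.
    """
    angles = 2 * (residues - 1)
    return states_per_angle ** angles


def years_to_enumerate(conformations: int, per_second: float) -> float:
    """Years needed to visit every conformation at a fixed, hypothetical rate."""
    seconds_per_year = 365.25 * 24 * 3600
    return conformations / per_second / seconds_per_year


if __name__ == "__main__":
    total = levinthal_conformations(100)          # 3 ** 198
    years = years_to_enumerate(total, 1e12)       # hypothetical 1 trillion checks/s
    print(f"conformations: {total:.3e}")
    print(f"years at 1e12 per second: {years:.3e}")
```

Even at that generous hypothetical rate, the enumeration would take vastly longer than the age of the universe, which is why a learned shortcut like AlphaFold matters more than faster hardware.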

RICHARD WINDSOR

Richard is the founder and owner of the research company Radio Free Mobile. He has 16 years of experience working in sell-side equity research. During his 11-year tenure at Nomura Securities, he focused on equity coverage of the Global Technology sector.