Artificial Intelligence – Catch-up.

UC Irvine treads in DeepMind’s footsteps.

  • Researchers at the University of California, Irvine have created an algorithm that can learn to solve the Rubik’s Cube from scratch, but I think this achievement pales when compared to what DeepMind achieved some time ago.
  • The Rubik’s Cube is a great task to choose, as it has flummoxed humans for two generations, meaning the problem is widely understood when the results are publicised.
  • Regular algorithms already exist that can solve the cube in less than one second, but the aim of this research was to create an algorithm that would learn how to do it by itself from first principles.
  • This is called reinforcement learning and is exactly the technique DeepMind used in 2015 to teach a machine to play the Atari game Breakout and, in 2016, AlphaGo to play Go.
  • Compared to Go, the Rubik’s Cube is a trivial problem with 4.3 x 10^19 (see here) possible combinations, compared to a regular 19×19 Go board, which has 2.1 x 10^170 legal positions (see here).
  • This is an order of complexity far greater than the human mind can possibly envisage, and the Rubik’s Cube is a rounding error of simplicity compared to it.
  • The researchers also trained the algorithm to solve the 48-puzzle (a 7×7 sliding-tile puzzle), which has a total of 3.0 x 10^62 combinations, but this is still around 1 x 10^108 times simpler than the Go problem.
  • However, the beauty of the Rubik’s cube is that it is known to humans as a difficult task and is also a task where the data set is both very finite and very stable.
  • This makes it a perfect task to address with a deep learning algorithm (see here) which the researchers have used in conjunction with reinforcement learning (see here).
  • The algorithm was capable of solving every scrambled configuration of the puzzle 100% of the time when self-trained from scratch knowing only the rules.
  • Furthermore, around 60% of the time, the solution it arrived at was the shortest possible one.
  • Like DeepMind, the researchers were able to take the same starting algorithm and successfully train it on different versions of the puzzle, allowing the authors to speculate that the untrained algorithm may have some other uses outside of Rubik’s-style puzzles.
  • This is, of course, a pot shot at RFM’s Goal No. 2 of AI: transfer learning (see here) which is the ability to take what one has learned from one task and apply it to another.
  • I have long believed that this is the most fundamental and critical problem that limits advances in AI today.
  • DeepMind made a similar claim in 2017 (see here) with AlphaGo Zero which RFM concluded was similarly misleading.
  • This is because, in both instances, the algorithm is not taking what it has learned and applying it to a new task but merely has the ability to be trained from a generic starting point to solve slightly different tasks.
  • One still ends up with one algorithm per task which I do not consider to be a solution to the very perplexing transfer learning problem.
  • The net result is that this paper is good news because it shows that the rest of the community is beginning to catch up with DeepMind and Google, opening the scope for their lead to be shortened.
  • I still see a big slowdown in AI progress caused by the limitations of deep learning (see here), leading to a narrowing of the field and the erosion of the lead of the current front runners: Google, Yandex and Baidu.
  • The slower progress of the leaders and this paper are signs that this is beginning to happen.
  • The third AI winter beckons.
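As a quick sanity check on the state counts above, the figures can be reproduced directly. This is a sketch: the Go count is taken at face value from the quoted source, while the 48-puzzle count uses the standard 49!/2 formula for the reachable states of a sliding-tile puzzle.

```python
import math

# Reproduce the state counts quoted above. The Go figure is taken as
# quoted; the 48-puzzle count is 49!/2, the standard formula for the
# reachable states of an n x n sliding-tile puzzle (here 7x7).
cube_3x3 = 43_252_003_274_489_856_000  # ~4.3 x 10^19 positions of the 3x3x3 cube
puzzle_48 = math.factorial(49) // 2    # ~3.0 x 10^62 reachable 48-puzzle states
go_log10 = math.log10(2.1) + 170       # Go: ~2.1 x 10^170 legal positions

# How many orders of magnitude separate the 48-puzzle from Go?
gap = go_log10 - math.log10(puzzle_48)
print(f"48-puzzle states: {puzzle_48:.2e}")   # 3.04e+62
print(f"Go is ~10^{gap:.0f} times larger")    # ~10^108
```

The gap works out to roughly 10^108 orders of magnitude, which is what makes even the largest puzzle in the study a rounding error next to Go.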
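For readers unfamiliar with the technique, reinforcement learning can be illustrated with a minimal tabular Q-learning sketch on a toy puzzle: sorting three tiles with adjacent swaps, rewarded only on reaching the goal. Everything here (the puzzle, reward scheme and hyper-parameters) is my own illustrative assumption; the actual research pairs a deep neural network with search rather than a lookup table.

```python
import itertools
import random

# Toy puzzle: sort a 3-element permutation using adjacent swaps.
GOAL = (0, 1, 2)
ACTIONS = [(0, 1), (1, 2)]  # index pairs that can be swapped
STATES = list(itertools.permutations(range(3)))

def step(state, action_idx):
    i, j = ACTIONS[action_idx]
    s = list(state)
    s[i], s[j] = s[j], s[i]
    return tuple(s)

def train(episodes=3000, alpha=0.5, gamma=0.9, eps=0.2, seed=0):
    rng = random.Random(seed)
    Q = {(s, a): 0.0 for s in STATES for a in range(len(ACTIONS))}
    for _ in range(episodes):
        state = rng.choice(STATES)      # start from a random scramble
        for _ in range(20):
            if state == GOAL:
                break
            # epsilon-greedy: explore sometimes, otherwise act greedily
            if rng.random() < eps:
                a = rng.randrange(len(ACTIONS))
            else:
                a = max(range(len(ACTIONS)), key=lambda x: Q[(state, x)])
            nxt = step(state, a)
            reward = 1.0 if nxt == GOAL else 0.0  # reward only at the goal
            best_next = max(Q[(nxt, x)] for x in range(len(ACTIONS)))
            Q[(state, a)] += alpha * (reward + gamma * best_next - Q[(state, a)])
            state = nxt
    return Q

def solves(Q, state, max_moves=10):
    # Follow the learned greedy policy; report whether the goal is reached.
    for _ in range(max_moves):
        if state == GOAL:
            return True
        a = max(range(len(ACTIONS)), key=lambda x: Q[(state, x)])
        state = step(state, a)
    return state == GOAL
```

The point of the sketch is the shape of the learning loop: the agent is told only the rules (the moves) and whether it has reached the goal, and the value estimates propagate backwards from the solved state until the greedy policy solves every scramble.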

RICHARD WINDSOR

Richard is the founder and owner of the research company Radio Free Mobile. He has 16 years of experience working in sell-side equity research. During his 11-year tenure at Nomura Securities, he focused on equity coverage of the global technology sector.