LAST CALL for questions for our big 2024 recap episode! Submit questions and messages on Speakpipe here for a chance to appear on the show! We record Monday.
It is remarkable how many AI professionals don’t take AI seriously, by which we mean fully thinking through the implications of scale and trendlines, and aligning their investments and actions with accelerating AI progress as the baseline. This is the main risk for the AI Engineer trying to plug current model capability gaps: trying to human-engineer their way out of something that should be, or soon will be, solved by machine learning1.
Properly establishing a mental framework/process for where ML training ends and AI engineering begins is of utmost interest to us. There is one person whom many have credited with feeling the AGI before anyone else: Ilya Sutskever.
This week we got not one but TWO windows into the thinking of Ilya: his widely publicized Test of Time talk at NeurIPS (transcript here), and OpenAI’s voluntary disclosure of his 2016 emails for the ongoing lawsuit with Elon. This gives us Ilya checkpoints for 2014, 2016, and 20242.
We analyze each checkpoint in turn, ending with a final speculative conversation about 2023.
What Ilya Saw in 2014
The relevant parts from his 2014 NIPS talk on sequence-to-sequence learning:
The Deep Learning Hypothesis - “if you have a large neural network, it can do anything a human can do in a fraction of a second”
The Autoregression Hypothesis - the simple next-token prediction/sequence-to-sequence task would grasp the correct distribution over sequences, generalizing from translation to everything else (sketched in code after this list).
The Scaling Hypothesis - “If you have a large big dataset, and you train a very big neural network, then success is guaranteed!”
The Connectionism Hypothesis - If you believe that an artificial neuron is like a biological neuron3, then very large neural networks can be “configured to do pretty much all the things that we human beings do”.
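To make the next-token prediction objective behind the Autoregression Hypothesis concrete, here is a minimal sketch in PyTorch. This is our own illustration and not code from Ilya’s talk; `TinyLM` and all hyperparameters are hypothetical placeholders, with an LSTM chosen only to match the 2014-era setup.

```python
# A minimal sketch of the autoregressive next-token objective (illustration only):
# a model predicts each next token conditioned on the tokens before it, and is
# trained with the standard cross-entropy loss.
import torch
import torch.nn as nn

class TinyLM(nn.Module):
    def __init__(self, vocab_size: int, d_model: int = 64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.rnn = nn.LSTM(d_model, d_model, batch_first=True)  # causal by construction
        self.head = nn.Linear(d_model, vocab_size)

    def forward(self, tokens):                    # tokens: (batch, seq_len)
        hidden, _ = self.rnn(self.embed(tokens))  # each position only sees its past
        return self.head(hidden)                  # logits: (batch, seq_len, vocab_size)

vocab_size = 100
model = TinyLM(vocab_size)
tokens = torch.randint(0, vocab_size, (2, 16))    # a toy batch of token sequences

# Next-token prediction: logits at position t are scored against the token at t+1.
logits = model(tokens[:, :-1])
loss = nn.functional.cross_entropy(
    logits.reshape(-1, vocab_size), tokens[:, 1:].reshape(-1)
)
loss.backward()
```

The point of the hypothesis is that this one simple objective, scaled up, is enough to generalize far beyond translation.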
This is admittedly a very selective interpretation of Ilya’s talk, discarding the couple of things (LSTMs) that didn’t age well, but on the big, simple, yet profound insights Ilya has been consistently correct. We would argue that insights which get the big picture right, even when built on small details that turn out wrong, are all the more long-lived and credible.
What Ilya Saw in 2016-2017
The OpenAI emails published this week also reveal deeply held beliefs from Ilya: