There's a mysterious list of research papers that Ilya Sutskever reportedly gave to John Carmack in 2020. While everyone talks about it, no one has ever seen it. Here's the story, an update on it, and the purported list itself.
John Carmack, the renowned game developer, rocket engineer, and VR visionary, shared in an interview that he asked Ilya Sutskever, OpenAI co-founder and former Chief Scientist, for a reading list about AI. Ilya responded with a list of approximately 40 research papers, saying:
If you really learn all of these, you'll know 90% of what matters today.
This elusive list became a topic of search and discussion, amassing 131 comments on Ask HN. So many people wanted it that Carmack posted on Twitter, expressing his hope that Ilya would make it public and noting that "a canonical list of references from a leading figure would be appreciated by many":
We agree. However, Ilya has yet to publish such a list, leaving us to speculate. Recently, an OpenAI researcher reignited the conversation by claiming to have compiled this list, and the post went viral.
Here's what was inside (grouped for your convenience):
Core Neural Network Innovations
Recurrent Neural Network Regularization - Enhancement to LSTM units for better overfitting prevention.
Pointer Networks - Novel architecture for solving problems with discrete token outputs.
Deep Residual Learning for Image Recognition - Improvements for training very deep networks through residual learning.
Identity Mappings in Deep Residual Networks - Enhancements to deep residual networks through identity mappings.
Neural Turing Machines - Combining neural networks with external memory resources for enhanced algorithmic tasks.
Attention Is All You Need - Introducing the Transformer architecture solely based on attention mechanisms.
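To ground that last item, here is a minimal NumPy sketch of scaled dot-product attention, the operation at the heart of the Transformer. The toy shapes and random inputs are illustrative assumptions, not code from the paper.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    # Q: (n_queries, d_k), K: (n_keys, d_k), V: (n_keys, d_v)
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)      # similarity of each query to each key
    weights = softmax(scores, axis=-1)   # each row sums to 1
    return weights @ V                   # weighted average of the values

# Toy example: 3 queries attending over 4 key/value pairs.
rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 8))
K = rng.normal(size=(4, 8))
V = rng.normal(size=(4, 16))
print(scaled_dot_product_attention(Q, K, V).shape)  # (3, 16)
```

In the full architecture this operation runs once per head, with learned projections producing Q, K, and V.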
Specialized Neural Network Applications
Multi-Scale Context Aggregation by Dilated Convolutions - A convolutional network module for better semantic segmentation.
Neural Machine Translation by Jointly Learning to Align and Translate - A model improving translation by learning to align and translate concurrently.
Neural Message Passing for Quantum Chemistry - A framework for learning on molecular graphs for quantum chemistry.
Relational RNNs - Enhancement to standard memory architectures integrating relational reasoning capabilities.
Deep Speech 2: End-to-End Speech Recognition in English and Mandarin - Deep learning system for speech recognition.
ImageNet Classification with Deep CNNs - Convolutional neural network for classifying large-scale image data.
Variational Lossy Autoencoder - Combines VAEs and autoregressive models for improved image synthesis.
A Simple NN Module for Relational Reasoning - A neural module designed to improve relational reasoning in AI tasks.
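To illustrate the relational reasoning module mentioned just above, here is a rough NumPy sketch of a Relation Network: a small MLP g scores every ordered pair of objects, the results are summed, and a second MLP f maps the sum to an output. All sizes and weights below are made-up stand-ins, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp(x, W1, b1, W2, b2):
    # Two-layer perceptron with ReLU, used for both g (pairs) and f (readout).
    return np.maximum(x @ W1 + b1, 0) @ W2 + b2

# Illustrative sizes: 5 "objects" of dimension 4; all weights are random stand-ins.
n_obj, d_obj, d_hidden, d_out = 5, 4, 16, 8
objects = rng.normal(size=(n_obj, d_obj))

Wg1, bg1 = rng.normal(size=(2 * d_obj, d_hidden)), np.zeros(d_hidden)
Wg2, bg2 = rng.normal(size=(d_hidden, d_hidden)), np.zeros(d_hidden)
Wf1, bf1 = rng.normal(size=(d_hidden, d_hidden)), np.zeros(d_hidden)
Wf2, bf2 = rng.normal(size=(d_hidden, d_out)), np.zeros(d_out)

# g scores every ordered pair of objects; the summed result is passed to f.
pair_sum = np.zeros(d_hidden)
for i in range(n_obj):
    for j in range(n_obj):
        pair = np.concatenate([objects[i], objects[j]])
        pair_sum += mlp(pair, Wg1, bg1, Wg2, bg2)

output = mlp(pair_sum, Wf1, bf1, Wf2, bf2)
print(output.shape)  # (8,)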
Theoretical Insights and Principled Approaches
Order Matters: Sequence to sequence for sets - Investigating the impact of data order on model performance.
Scaling Laws for Neural LMs - Empirical study on the scaling laws of language model performance.
A Tutorial Introduction to the Minimum Description Length Principle - Tutorial on the MDL principle in model selection and inference.
Keeping Neural Networks Simple by Minimizing the Description Length of the Weights - Method to improve neural network generalization by minimizing weight description length.
Machine Super Intelligence (dissertation) - Study on the optimal behavior of agents in computable environments.
Kolmogorov Complexity (page 434 onwards) - Comprehensive exploration of Kolmogorov complexity, discussing its mathematical foundations and implications for fields like information theory and computational complexity.
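Kolmogorov complexity itself is uncomputable, but a standard way to build intuition is to use the length of a compressed representation as a crude upper bound. The snippet below is just that kind of illustration, using Python's zlib; it is my example, not material from the reference above.

```python
import random
import zlib

def compressed_size(data: bytes) -> int:
    # Length of the zlib-compressed representation: a crude, computable
    # upper-bound stand-in for the (uncomputable) Kolmogorov complexity.
    return len(zlib.compress(data, level=9))

regular = b"ab" * 500  # highly structured: a short program could print it
random.seed(0)
noisy = bytes(random.getrandbits(8) for _ in range(1000))  # near-incompressible

print(compressed_size(regular))  # small: the regularity is captured by the compressor
print(compressed_size(noisy))    # close to 1000: little structure to exploit
```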
Interdisciplinary and Conceptual Studies
Quantifying the Rise and Fall of Complexity in Closed Systems: The Coffee Automaton - Study on complexity in closed systems using cellular automata.
Efficiency and Scalability Techniques
GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism - A method for efficient training of large-scale neural networks.
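The key idea behind GPipe is to split each mini-batch into micro-batches so that pipeline stages can work on different micro-batches at the same time. The toy schedule below only prints the fill-and-drain timing of the forward pass under assumed stage and micro-batch counts; it does no real computation and is not code from the paper.

```python
# Toy schedule for the forward pass of a GPipe-style pipeline.
n_stages, n_microbatches = 4, 8

# In the simple fill-drain schedule, stage s works on micro-batch m at step s + m.
total_steps = n_stages + n_microbatches - 1
for step in range(total_steps):
    busy = [f"S{s}:m{step - s}" for s in range(n_stages)
            if 0 <= step - s < n_microbatches]
    print(f"step {step:2d}: " + "  ".join(busy))

# Fraction of stage-steps doing useful work; it approaches 1 as micro-batches increase.
utilization = (n_stages * n_microbatches) / (n_stages * total_steps)
print(f"utilization = {utilization:.2f}")
```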
Educational Materials and Tutorials
CS231n: Convolutional Neural Networks for Visual Recognition - Stanford University course on CNNs for visual recognition.
The Annotated Transformer - Annotated, line-by-line implementation of the Transformer paper, with accompanying code.
The First Law of Complexodynamics - Blog post discussing the measure of system complexity in computational terms.
The Unreasonable Effectiveness of RNNs - Blog post demonstrating the versatility of RNNs.
Understanding LSTM Networks - Blog post providing a detailed explanation of LSTM networks.
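As a companion to that last post, here is a single forward step of a textbook LSTM cell in NumPy, with the usual forget, input, and output gates. The dimensions and random weights are placeholders chosen only so the example runs.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, b):
    # One step of a standard LSTM cell, following the usual gate equations.
    # W: (input_dim + hidden_dim, 4 * hidden_dim), b: (4 * hidden_dim,)
    z = np.concatenate([x, h_prev]) @ W + b
    f, i, o, g = np.split(z, 4)
    f, i, o = sigmoid(f), sigmoid(i), sigmoid(o)  # forget, input, output gates
    g = np.tanh(g)                                # candidate cell update
    c = f * c_prev + i * g                        # new cell state
    h = o * np.tanh(c)                            # new hidden state
    return h, c

# Toy dimensions and random weights, just to run one step.
rng = np.random.default_rng(0)
input_dim, hidden_dim = 3, 5
W = rng.normal(size=(input_dim + hidden_dim, 4 * hidden_dim))
b = np.zeros(4 * hidden_dim)
h, c = np.zeros(hidden_dim), np.zeros(hidden_dim)
h, c = lstm_step(rng.normal(size=input_dim), h, c, W, b)
print(h.shape, c.shape)  # (5,) (5,)
```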
Thanks for reading!