29.4% ARC-AGI-2 🤯 (TOP SCORE!) - Jeremy Berman

Machine Learning Street Talk • Sep 27, 2025

Audio Brief

This episode features Jeremy Berman, winner of the ARC-AGI V2 leaderboard, discussing his evolutionary algorithm for abstract reasoning in natural language. Four core takeaways emerge from the conversation. First, for abstract reasoning, natural language can be a more flexible and powerful programming paradigm than traditional code. Second, the true goal of AGI research is the "meta-skill" of reasoning: the ability to learn how to learn. Third, current deep learning training methods such as Stochastic Gradient Descent (SGD) have inherent limitations that may prevent them from discovering general, causal models. Fourth, a fundamental shift is needed from associative knowledge (a "web") to a deductive, principled understanding of the world (a "tree").

Berman's winning approach uses plain-English descriptions as an expressive "programming language" for solving abstract visual puzzles. This contrasts with generating brittle Python code and demonstrates how efficiently language can describe solutions. The ARC-AGI challenge itself tests an AI's ability to generalize from few examples, a key weakness of current models.

True artificial general intelligence, Berman argues, is defined not by accumulating many skills but by the foundational ability to acquire new skills efficiently. This "meta-skill" is reasoning, and it marks a philosophical distinction in AI development: the process of learning matters more than the learned outcomes.

SGD, the standard deep learning training method, is argued to be biased toward finding "shortcuts." This suggests SGD may be incapable of discovering the truly general, factored algorithms required for robust, human-like reasoning, and it calls into question whether current paradigms can achieve fundamental understanding.

Pre-training often treats knowledge as an associative "web": embeddings are connected, but with no guarantee of correct causal relationships between them. A deeper shift is needed toward structured, deductive, causal understanding built from first principles, metaphorically a "knowledge tree." This move is crucial for developing AI with genuine understanding. Ultimately, the path to true artificial general intelligence requires rethinking fundamental AI design principles, from language as code to the very nature of learning and knowledge representation.

Episode Overview

  • This episode features an interview with Jeremy Berman, the winner of the ARC-AGI V2 leaderboard, who discusses his innovative evolutionary algorithm for abstract reasoning.
  • The core of Berman's approach is using natural language (English) as a more expressive "programming language" to describe puzzle solutions, rather than generating brittle Python code.
  • The conversation explores the limitations of current AI, including catastrophic forgetting and the biases of training methods like Stochastic Gradient Descent (SGD).
  • It delves into the philosophical nature of intelligence, defining reasoning as the "meta-skill" of acquiring new skills, and contrasting associative knowledge (a "web") with deductive understanding (a "tree").

Key Concepts

  • The ARC-AGI Challenge: An intelligence test for AI that uses abstract visual puzzles to measure an AI's ability to generalize from a few examples, a key weakness of current models.
  • Language as a Program: The central idea that for abstract reasoning tasks, natural language is more expressive and efficient for describing solutions than a formal programming language like Python.
  • Evolutionary Algorithm: Berman's winning method involves generating and refining plain English descriptions of puzzle solutions over many generations, selecting for the most effective ones.
  • Reasoning as a Meta-Skill: The concept that true AGI is not about having many skills, but possessing the foundational ability to acquire new skills efficiently, which is defined as reasoning.
  • Inductive vs. Transductive Methods: A distinction between approaches that generate an intermediate executable program (inductive) and those that directly transform instructions into a final solution (transductive), which was Berman's method.
  • Limitations of SGD: The argument that Stochastic Gradient Descent, the standard deep learning training method, is biased toward finding "shortcuts" and may be incapable of discovering the truly general, factored algorithms required for robust reasoning.
  • Knowledge Web vs. Knowledge Tree: A metaphor distinguishing the associative, interconnected knowledge from pre-training (web) from the structured, deductive, and causal understanding built from first principles that constitutes true reasoning (tree).
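The evolutionary loop sketched in the "Evolutionary Algorithm" concept above can be illustrated generically. The following is a minimal sketch, not Berman's actual pipeline: candidate English descriptions are scored and the best are revised over generations. In his system, scoring and revision are done by an LLM executing a description against the puzzle's training examples; here, `toy_fitness`, `toy_mutate`, and the `TARGET` string are invented stand-ins for those LLM calls.

```python
import random

def evolve_descriptions(initial, fitness, mutate,
                        generations=30, pop_size=8, keep=3):
    """Evolve natural-language candidate solutions.

    fitness(desc) -> score in [0, 1]: fraction of training examples the
    description solves (in Berman's system, judged by an LLM -- stubbed here).
    mutate(desc)  -> str: a revised description (in practice, an LLM rewrite).
    """
    population = list(initial)
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[:keep]
        if fitness(parents[0]) == 1.0:   # a description solves every example
            return parents[0]
        children = [mutate(random.choice(parents))
                    for _ in range(pop_size - len(parents))]
        population = parents + children
    return max(population, key=fitness)

# Toy stand-ins: "solving" here just means matching a hidden target phrase.
TARGET = "recolor each shape by its size"

def toy_fitness(desc):
    return sum(a == b for a, b in zip(desc, TARGET)) / len(TARGET)

def toy_mutate(desc):
    # Fix the first wrong character (stand-in for an LLM revision step).
    for i, ch in enumerate(TARGET):
        if i >= len(desc) or desc[i] != ch:
            return desc[:i] + ch + desc[i + 1:]
    return desc

best = evolve_descriptions(["", "recolor"], toy_fitness, toy_mutate, keep=1)
```

With `keep=1` the toy run is deterministic and converges on the target description; in the real system, population diversity and LLM-driven mutation do the heavy lifting, and the early-exit check corresponds to finding a description that reproduces all of a task's training outputs.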

Quotes

  • At 0:49 - "That's why I switched from Python to English, which is a much more expressive program." - Berman provides the core reason for changing his approach from evolving code to evolving natural language descriptions.
  • At 0:55 - "It's the meta skill, it's the skill to create the skills that is AGI. And to me, that's reasoning." - Berman offers his definition of reasoning as the foundational capability for achieving general intelligence.
  • At 28:04 - "[My co-host] says that... stochastic gradient descent does not find the algorithms that allow the systems to behave as if they are Turing machines." - Dr. Tim Scarfe highlights a fundamental limitation of current training methods, suggesting they are not designed to discover the kind of complex, general algorithms that enable true reasoning.
  • At 50:08 - "Pre-training treats all knowledge as a knowledge web... it's embeddings that are connected, but there's no guarantee that you have the correct causal relationship between things." - Jeremy Berman explains his view of pre-training as creating an associative network of knowledge, which lacks the deductive structure of true reasoning.
  • At 57:08 - "Ablating at will pre-trained data and then... building an environment to have the model regenerate that information." - Jeremy Berman describes his "earworm" idea: forcing models to rediscover fundamental principles from scratch to build a true reasoning circuit.

Takeaways

  • For problems involving abstract reasoning, natural language can serve as a more flexible and powerful programming paradigm than traditional, rigid code.
  • The true goal of AGI research should be to develop the "meta-skill" of reasoning—the ability to learn how to learn—rather than simply accumulating more specific skills.
  • Current deep learning training methods like SGD have inherent limitations that may prevent them from discovering the general, causal models required for true understanding.
  • A fundamental shift in AI development is needed to move from associative knowledge (a "web") to a deductive, principled understanding of the world (a "tree").