What Is AI Missing for Real Reasoning? Axiom Math’s Carina Hong on how to build an AI mathematician

Turing Post Nov 28, 2025

Audio Brief

This episode explores why Large Language Models struggle with mathematical reasoning and Axiom Math's strategy for building a self-improving AI mathematician for scientific discovery. There are three key takeaways from this conversation. First, true AI reasoning demands moving beyond next-token prediction to verifiable, step-by-step logical deduction. Second, a self-improving AI for scientific discovery requires a cyclical system in which one AI proposes new questions and another solves them, continually expanding a verified knowledge base. Third, advanced AI may emerge from specialized, superintelligent models in core domains like mathematics, acting as reasoning engines for other scientific fields.

Current Large Language Models, trained on next-token prediction, produce plausible-sounding text but fundamentally lack genuine step-by-step logical reasoning. This leads to errors in both basic arithmetic and complex proofs. A promising solution is a neuro-symbolic hybrid approach that combines LLMs' intuitive power with the rigor of formal verification systems, ensuring every reasoning step is verifiable and trustworthy.

Axiom Math envisions a self-improving "AI mathematician" built on three pillars. A prover system generates mathematical proofs. A comprehensive knowledge base stores verified mathematical theorems. Finally, a conjecture system proposes new mathematical problems and hypotheses for the prover to tackle. Together they create a continuous cycle of discovery and learning.

The broader shift is from general AI to domain-specific superintelligences, with mathematical reasoning serving as the algorithmic foundation for "AI for Science." By establishing a robust reasoning platform in mathematics, AI can drive radical breakthroughs in physics, biology, and other scientific disciplines. This approach highlights the critical need for verifiable AI in complex reasoning and its potential to revolutionize scientific discovery.

Episode Overview

  • Carina Hong, CEO of Axiom Math, discusses why current Large Language Models (LLMs) struggle with mathematical reasoning and the fundamental differences between next-token prediction and true logical deduction.
  • She outlines Axiom Math's vision for creating a self-improving "AI mathematician" built on three pillars: a prover system, a comprehensive knowledge base, and a conjecture system.
  • The conversation highlights the shift from general AI (AGI) to developing domain-specific superintelligences, with mathematical reasoning serving as the algorithmic foundation for broader "AI for Science."
  • Hong emphasizes the importance of a hybrid approach, combining the intuitive power of LLMs with the rigor of formal verification systems like Lean to ensure every step in the reasoning process is verifiable and trustworthy.
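
To make the idea of verifiable intermediate steps concrete, here is a toy Lean 4 sketch (an illustrative theorem, not from the episode): Lean's kernel checks every step of the calc chain, so a proof that merely sounds plausible is rejected outright.

```lean
-- Toy theorem: each step below is checked by Lean's kernel.
-- If either justification were wrong, the file would not compile,
-- so the intermediate steps, not just the result, are verified.
theorem steps_are_checked (a b : Nat) : (a + b) + 0 = b + a :=
  calc (a + b) + 0 = a + b := Nat.add_zero (a + b)
    _ = b + a := Nat.add_comm a b
```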

Key Concepts

  • Limitations of LLMs in Math: Current models are trained on next-token prediction, which produces plausible-sounding text but does not perform genuine, step-by-step logical reasoning. This leads to errors in both basic arithmetic and complex proofs.
  • Formal Verification: The process of using formal methods to prove or disprove the correctness of algorithms or mathematical statements. This is crucial for creating reliable AI systems, as it ensures the entire reasoning process, not just the final answer, is verifiable.
  • Three Pillars of an AI Mathematician: Axiom Math's approach is built on three core components that work in a self-improving loop (sketched in code after this list):
    1. Prover System: An AI that can generate mathematical proofs for given problems.
    2. Knowledge Base: A structured, verifiable database of mathematical theorems, definitions, and concepts, akin to a knowledge graph.
    3. Conjecture System: An AI that can propose new, interesting mathematical problems and hypotheses for the prover to tackle.
  • AI for Science: The idea that AI for mathematics can serve as the core algorithmic engine for breakthroughs in other scientific fields. By creating a robust reasoning platform, AI can be applied to complex problems in physics, biology, and other domains.
  • Neuro-Symbolic Hybrid Approach: The strategy of combining neural networks (like LLMs) for their pattern recognition and intuitive capabilities with symbolic systems (like formal verification) for their logical rigor and verifiability. This approach aims to get the best of both worlds for complex reasoning tasks.
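
To show how the three pillars interlock, here is a minimal, hypothetical Python sketch of the loop. All function names are illustrative placeholders, not Axiom Math's actual system: in practice the prover and conjecturer would be learned models, and verification would go through a formal checker such as Lean.

```python
# Minimal sketch of the three-pillar loop described above. All names
# (propose_conjecture, attempt_proof, formally_verified) are hypothetical
# placeholders, not Axiom Math's actual API.
from dataclasses import dataclass, field

@dataclass
class KnowledgeBase:
    """Pillar 2: a growing store of verified statements."""
    theorems: list[str] = field(default_factory=list)

    def add(self, statement: str) -> None:
        self.theorems.append(statement)

def propose_conjecture(kb: KnowledgeBase) -> str:
    # Pillar 3 (conjecture system): in practice a model prompted with
    # the current knowledge base; here, a numbered placeholder.
    return f"conjecture_{len(kb.theorems)}"

def attempt_proof(conjecture: str) -> str | None:
    # Pillar 1 (prover system): a model or search procedure drafting a
    # candidate formal proof. Placeholder: every conjecture gets a draft.
    return f"proof_of({conjecture})"

def formally_verified(proof: str) -> bool:
    # Symbolic half of the hybrid: a proof checker (e.g. Lean's kernel)
    # accepts or rejects the draft. Placeholder: always accepts.
    return bool(proof)

def discovery_loop(kb: KnowledgeBase, rounds: int) -> None:
    for _ in range(rounds):
        conjecture = propose_conjecture(kb)
        proof = attempt_proof(conjecture)
        # Only statements that pass the symbolic check enter the base,
        # which keeps the self-improving loop from compounding errors.
        if proof is not None and formally_verified(proof):
            kb.add(conjecture)

kb = KnowledgeBase()
discovery_loop(kb, rounds=3)
print(kb.theorems)  # ['conjecture_0', 'conjecture_1', 'conjecture_2']
```

The design point is the verification gate: the neural components propose, the symbolic checker disposes, and only verified results feed the next round. That is the neuro-symbolic division of labor described above.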

Quotes

  • At 01:15 - "Current models are trained by next token prediction that isn't exactly what's going on when you're doing mathematical reasoning." - Explaining the fundamental limitation of LLMs when it comes to solving math problems.
  • At 02:11 - "[Models are] not great at coming up with de novo mathematical knowledge." - Highlighting that current AI is better at retrieving information than creating genuinely new mathematical insights.
  • At 03:36 - "We want all the intermediate steps, the reasoning process to be verifiable as well." - Emphasizing the importance of formal verification to ensure the trustworthiness and correctness of an AI's mathematical proofs.
  • At 07:16 - "We are probably functioning in a space where we can create new knowledge." - Describing the exciting frontier of using AI to make new scientific and mathematical discoveries.
  • At 13:02 - "We think of AI for math as also the algorithmic pillar of AI for science, we make the radical breakthroughs and then real-world testing..." - Positioning AI for math as a foundational technology that will enable advancements across all scientific disciplines.

Takeaways

  • To achieve true AI reasoning, models must move beyond simply predicting the next word and incorporate verifiable, step-by-step logical deduction.
  • Building a self-improving AI for scientific discovery requires a cyclical system where an AI proposes new questions (conjecturer) and another AI works to solve them (prover), continuously expanding a verified knowledge base.
  • The most promising path to advanced AI may not be a single AGI, but rather the development of specialized, superintelligent models focused on core domains like mathematics, which can then serve as reasoning engines for other scientific fields.