Why the US Needs Open Models | Nathan Lambert on what matters in the AI and science world
Audio Brief
This episode explores the complex dynamics between open and closed AI development, examining why independent researchers have filled the communication void left by secretive frontier labs and how geopolitical incentives drive model release strategies.
There are four key takeaways from this discussion. First, the primary barrier to replicating top-tier AI is no longer raw computing power but the expensive and complex post-training phase. Second, geopolitical constraints force Chinese labs to release open weights to gain market share, while US labs keep models closed to protect intellectual property. Third, companies must evaluate the total cost of ownership when choosing between open models and closed APIs, as open source often entails hidden infrastructure costs. Finally, the industry is shifting toward hybrid architectures that combine Transformers with efficient state-space models like Mamba.
The conversation begins by addressing the communication gap in AI research. Top scientists at labs like OpenAI and Google are increasingly silenced by non-disclosure agreements, leaving a void that independent researchers now fill by translating complex advancements for the public. While closed frontier models currently lead on performance benchmarks, open models remain the essential platform for the scientific community to study safety and alignment. A significant moat, however, has emerged in the form of post-training. This phase, which involves reinforcement learning from human feedback and generating high-quality synthetic data, shapes a model's utility and personality. Its massive financial cost creates a persistent lag between open-source projects and closed frontier models.
Geopolitics plays a surprising role in how these models are released. Nvidia releases open models to drive hardware demand, while Chinese labs release open weights out of necessity. Western enterprises cannot legally send data to Chinese APIs, so to compete globally, Chinese firms must offer local, private solutions via open weights. Conversely, US labs lock down their models to maintain competitive advantage. This leads to a distinct vendor lock-in effect, where a model's specific interaction style becomes the product itself. Users build habits around the phrasing and personality of a specific model, creating friction when attempting to switch providers.
For enterprises, the choice between open and closed systems is often misunderstood. While open weights are ostensibly free, the infrastructure required to host them effectively can be more expensive than paying for a closed API. The real friction for adoption is often the engineering harness rather than the model's intelligence. Furthermore, developers are advised to anticipate a six to nine month lag where the best open models trail closed frontier systems. The smart strategy is often to prototype with bleeding-edge closed models and optimize costs later with open solutions once they catch up.
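The total-cost-of-ownership comparison can be sketched as a back-of-the-envelope model. Every figure below (API price per million tokens, GPU rental rate, engineer cost, monthly token volume) is a hypothetical placeholder for illustration, not a number from the episode:

```python
def api_monthly_cost(tokens_per_month: float, price_per_million: float) -> float:
    """Closed API: pay per token, no fixed infrastructure."""
    return tokens_per_month / 1_000_000 * price_per_million

def self_host_monthly_cost(gpu_hourly_rate: float, num_gpus: int,
                           engineer_monthly_cost: float, num_engineers: int) -> float:
    """Open weights: the model is 'free', but GPUs run around the clock
    and someone has to build and maintain the serving harness."""
    gpu_cost = gpu_hourly_rate * num_gpus * 24 * 30  # assume ~30-day month
    return gpu_cost + engineer_monthly_cost * num_engineers

# Hypothetical mid-size deployment: 200M tokens/month.
api = api_monthly_cost(200_000_000, price_per_million=10.0)
hosted = self_host_monthly_cost(gpu_hourly_rate=2.0, num_gpus=8,
                                engineer_monthly_cost=15_000, num_engineers=2)

print(f"Closed API:  ${api:,.0f}/month")    # $2,000/month
print(f"Self-hosted: ${hosted:,.0f}/month") # $41,520/month
```

At this (invented) volume the "free" open model is an order of magnitude more expensive once infrastructure and engineering are counted; the break-even only shifts toward self-hosting at much higher token volumes or for teams that already have the serving expertise.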
Looking toward the technical horizon, the discussion highlights the move away from pure Transformers toward hybrid architectures. These new designs integrate efficient state-space models like Mamba with traditional Transformer layers. This combination aims to reduce inference costs and improve speed without sacrificing the generation quality that pure Mamba models sometimes lack. For engineers, this signals a need to upskill on these hybrid systems, as they represent the future of efficient, long-context AI applications.
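The inference-cost argument for hybrids can be illustrated with a toy per-token cost model: with a KV cache, each new token in a Transformer layer attends over the entire context, so its cost grows linearly with context length, while a state-space layer like Mamba updates a fixed-size recurrent state at constant cost. The layer counts and unit costs below are arbitrary assumptions chosen to show the scaling, not measurements of any real model:

```python
def transformer_layer_cost(context_len: int) -> int:
    # Each new token attends to every cached token: O(n) per token.
    return context_len

def ssm_layer_cost(context_len: int) -> int:
    # A state-space layer updates a fixed-size state: O(1) per token.
    return 1

def per_token_cost(context_len: int, n_transformer: int, n_ssm: int) -> int:
    """Abstract per-token decode cost for a stack of layers."""
    return (n_transformer * transformer_layer_cost(context_len)
            + n_ssm * ssm_layer_cost(context_len))

for ctx in (1_000, 100_000):
    pure = per_token_cost(ctx, n_transformer=32, n_ssm=0)    # pure Transformer
    hybrid = per_token_cost(ctx, n_transformer=4, n_ssm=28)  # mostly SSM layers
    print(f"context={ctx:>7}: pure={pure:>9}  hybrid={hybrid:>7}")
```

The gap widens with context: at 100k tokens the hybrid stack in this toy model is roughly 8x cheaper per generated token, which is why keeping a few attention layers (for generation quality) inside a mostly-SSM stack is attractive for long-context workloads.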
Ultimately, understanding the true costs of post-training and the geopolitical strategy behind release cycles is essential for navigating the current AI landscape.
Episode Overview
- Understanding the "Communication Void": Explores why top AI researchers at frontier labs (OpenAI, Google) are silent due to NDAs, and how this has created a vital role for independent researchers to translate complex advancements for the public.
- The Strategic Battle of Open vs. Closed AI: Analyzes the geopolitical and economic incentives driving model releases, explaining why Nvidia and Chinese labs push for open source while US frontier labs lock down their intellectual property.
- The "Post-Training" Moat: Reveals that the true barrier to entry in AI isn't just raw computing power, but the incredibly expensive and complex "post-training" phase (RLHF, synthetic data) that shapes a model's personality and utility.
- Future Architectures and Enterprise Reality: Discusses the shift toward hybrid models (mixing Transformers with Mamba), the economic imperative of continual learning, and why enterprises often choose closed APIs over open models despite the hype.
Key Concepts
- Open Models as the Research Engine: While closed frontier models (like GPT-4o) currently lead in performance, open models serve as the essential platform for academic and independent research. Without open weights, the scientific community cannot effectively study safety, alignment, or interpretability.
- The "Post-Training" Moat: The hardest part of replicating top-tier AI is no longer pre-training (feeding text to a model). It is the capital-intensive post-training phase—purchasing high-quality human data and generating synthetic data for instruction tuning. This financial barrier, rather than a talent gap, creates the performance lag between open and closed models.
- The Geopolitics of Release Strategies:
- Nvidia: Releases open models to drive demand for GPUs (hardware business model).
- Chinese Labs: Release open weights because Western companies cannot legally send data to Chinese APIs. To gain market share, they must offer local, private solutions.
- US Frontier Labs: Keep weights closed to protect IP and maintain competitive advantage.
- Model Personality as "Vendor Lock-in": Post-training creates a specific "personality" or interaction style for a model. This acts as a soft lock-in because the "language is the interface." Users build habits around how a specific model (like Claude vs. GPT) phrases things, creating friction when trying to switch providers.
- Hybrid Architectures: The industry is moving away from pure Transformers toward "hybrid models" that integrate efficient architectures like Mamba (state-space models). While Mamba failed to replace Transformers initially due to poor generation quality, combining it with Transformer layers creates models that are cheaper and faster to run without sacrificing quality.
- Infrastructure over Ideology: The primary friction for enterprise adoption of open source isn't the model intelligence, but the "harness." It is often cheaper and easier for companies to pay for a closed API than to build the expensive engineering team and compute infrastructure required to host "free" open weights effectively.
Quotes
- At 1:29 - "A lot of these communicators... can't talk. So that's this void that I have been launched into... largely due to the dynamics of the AI industry." - Explaining why independent researchers have become the primary educators as lab scientists go quiet.
- At 5:40 - "Tech companies and potential customers in the US won't sign up for an API where data is sent to China... Our only other option is open weights." - Revealing the strategic necessity behind why Chinese labs are leading the charge in open source releases.
- At 11:17 - "I think pre-training data is the hardest legal part to get open... Post-training tends to be fairly complex at the frontier with a lot of models and a lot of sequencing... hard decision making." - Differentiating the legal barriers of data collection vs. the technical barriers of fine-tuning.
- At 14:53 - "If even AI2 is spending millions of dollars on effective compute for synthetic data... you're going to guess that the compute expenditure there is almost billions at these frontier labs." - Highlighting the massive financial barrier that prevents open source projects from matching closed labs.
- At 19:11 - "The Mamba models did really well on pre-training benchmarks, but the actual text generation from them wasn't nice... [It required] balancing them better with traditional transformer-style architectures." - Explaining why new architectures rarely replace old ones immediately; they usually evolve into hybrid solutions.
- At 25:06 - "In enterprise situations, lawyers do not like vague, extra terms attached. A lot of people that end up writing these custom AI licenses do it in ways that [create] exposure of risk." - Why "open weights" with restrictive licenses often fail in business compared to truly permissive licenses like Apache 2.0.
- At 31:18 - "Therefore the model's language is the interface by which the user sees the information... the model's output is so literally the product." - A crucial insight into why post-training creates competitive moats: you aren't just selling intelligence, you are selling a specific user experience.
Takeaways
- Evaluate Total Cost of Ownership, Not Just API Fees: When choosing between open and closed models for business, calculate the cost of the engineering team and infrastructure required to host open models. Often, the "free" model is more expensive than the paid API for non-tech enterprises.
- Anticipate a 6-9 Month "Open" Lag: Plan your development cycles understanding that the best open-source models will consistently trail closed frontier models by roughly six to nine months. Use closed models for prototyping bleeding-edge features and open models for cost-optimization later.
- Focus on Data Licensing Risks: Be wary of "open weights" that come with restrictive custom licenses. If you are building for enterprise, favor models with clear Apache 2.0 licenses (often coming from Chinese labs or established players like Meta) to avoid legal exposure.
- Prepare for Hybrid Models: If you are an ML engineer, start upskilling on hybrid architectures (Transformer + Mamba/Linear Attention). The future of efficient inference lies in these combinations, particularly for long-context applications where standard Transformers become too expensive.