Key thesis: The next phase of AI is not defined by scaling parameters, but by architectural structure—specifically world models, energy-based learning, and agentic systems.


For much of the past decade, progress in artificial intelligence followed a simple rule: scale works.
More parameters, more data, more compute translated into broader capabilities and increasingly general behavior. Large language models became the visible peak of this paradigm.

Yet inside the field, a quieter discussion has been unfolding—one focused not on performance gains, but on architectural ceilings. Among the most consistent voices in this discussion is Yann LeCun, whose critique of large language models is neither reactionary nor dismissive. It is structural.

LeCun’s position does not argue that LLMs have failed. It argues that they have succeeded within a narrow design space—and that this space is insufficient for the next phase of intelligence.


The Scaling Paradigm and Its Structural Boundaries

Scaling has delivered undeniable results. GPT-class models demonstrated emergent reasoning, abstraction, and tool use that few expected from statistical systems trained on text alone. Reinforcement learning, instruction tuning, and scaffolding further extended these capabilities.

However, scaling also revealed constraints that do not soften with more compute:

- Grounding: models learn about the world only through text, approximating physical and causal structure rather than interacting with it.
- Memory: context windows grow, but persistent, integrated memory remains external to the model.
- Planning: long-horizon, multi-step reasoning is delegated to scaffolding and heuristics rather than handled natively.
- Causal reasoning: next-token prediction captures correlation in language, not the consequences of actions.

From a systems perspective, these are not performance issues. They are design constraints.

Empirically, the cost curve reinforces this view. Training runs that cost single-digit millions of dollars two years ago now cost an order of magnitude more, while gains on planning-heavy benchmarks plateau. Scaling still improves fluency and coverage, but struggles to deliver proportional advances in grounding, memory, and causal reasoning.

This is the context in which post-scaling architectures become relevant—not as replacements, but as structural extensions.


World Models: From Prediction to Internal Simulation

LeCun’s alternative centers on world models: internal representations that allow systems to simulate environments, reason about consequences, and plan actions before execution.

Unlike LLMs, which learn statistical regularities in language, world models aim to encode:

- the dynamics of an environment, so that future states can be predicted from the current one;
- causal structure, so that the consequences of actions can be distinguished from mere correlations;
- persistent state, so that the representation outlives the immediate input.

The distinction is fundamental. A system with a world model does not merely predict what comes next—it evaluates what could happen under different actions.
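
To make this concrete, here is a minimal planning sketch in Python. It assumes a toy `ToyWorldModel` with hand-written dynamics and a simple quadratic goal cost; every name and number is an illustrative stand-in for what would be a learned model, not any specific lab's architecture.

```python
# Minimal sketch of planning with a world model: roll out candidate
# action sequences inside the model and keep the best, instead of
# emitting the next action directly. The dynamics and cost below are
# illustrative assumptions, not a learned system.
from itertools import product

class ToyWorldModel:
    """Stands in for a learned dynamics model: (state, action) -> next state."""
    def predict(self, state: float, action: float) -> float:
        # Toy dynamics: the action nudges the state, with some drag.
        return 0.9 * state + action

def goal_cost(state: float, goal: float) -> float:
    """How far a predicted state is from the goal (lower is better)."""
    return (state - goal) ** 2

def plan(model: ToyWorldModel, state: float, goal: float,
         actions=(-1.0, 0.0, 1.0), horizon: int = 3):
    """Exhaustively simulate short action sequences; return the cheapest."""
    best_seq, best_cost = None, float("inf")
    for seq in product(actions, repeat=horizon):
        s, cost = state, 0.0
        for a in seq:
            s = model.predict(s, a)   # imagine the consequence; don't act yet
            cost += goal_cost(s, goal)
        if cost < best_cost:
            best_seq, best_cost = seq, cost
    return best_seq, best_cost

if __name__ == "__main__":
    seq, cost = plan(ToyWorldModel(), state=0.0, goal=5.0)
    print(f"best action sequence: {seq}, predicted cost: {cost:.2f}")
```

The point is structural: the model is queried for consequences before any action is taken, which is exactly what next-token generation, on its own, does not do.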

This approach has moved beyond theory. Several major research efforts point in this direction:

- Meta's JEPA family (I-JEPA, V-JEPA), LeCun's own joint-embedding predictive architectures, which learn by predicting abstract representations of the world rather than raw pixels or tokens;
- DeepMind's Dreamer line of agents, which learn latent world models and plan inside them;
- generative interactive environments such as DeepMind's Genie, which learn action-conditioned simulations from video.

These efforts share a common assumption: intelligence requires an internal model of reality that persists beyond immediate input.


Energy-Based Learning and Constraint-Oriented Reasoning

A second architectural pillar in LeCun’s framework is energy-based learning. Instead of producing outputs directly, energy-based models assign scores to configurations, favoring coherent states over inconsistent ones.

This reframes reasoning as constraint satisfaction rather than sequence generation.
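
A minimal sketch of that reframing, assuming a hand-written `energy` function over whole configurations; in a real energy-based model this function would be learned, and the constraints here are purely illustrative:

```python
# Minimal sketch of energy-based selection: score whole configurations
# and keep the lowest-energy (most coherent) one, rather than generating
# an answer left to right. The energy function is hand-written and
# illustrative; in a real EBM it would be learned from data.

def energy(config: dict) -> float:
    """Assign a scalar score to a configuration; lower means more coherent.
    Each violated constraint adds a penalty."""
    e = 0.0
    if config["raining"] and not config["take_umbrella"]:
        e += 1.0      # incoherent: rain but no umbrella
    if config["take_umbrella"] and config["hands_full"]:
        e += 2.0      # physically inconsistent: no free hand
    return e

def infer(candidates):
    """Inference as constraint satisfaction: pick the minimum-energy candidate."""
    return min(candidates, key=energy)

if __name__ == "__main__":
    candidates = [
        {"raining": True, "take_umbrella": False, "hands_full": False},
        {"raining": True, "take_umbrella": True,  "hands_full": False},
        {"raining": True, "take_umbrella": True,  "hands_full": True},
    ]
    print("lowest-energy configuration:", infer(candidates))
```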

The implications are significant:

- Multiple candidate outputs can be compared under a single scoring function rather than produced token by token.
- Inconsistent or contradictory states are penalized explicitly, instead of being filtered after the fact.
- Inference becomes a search or optimization problem over configurations, which naturally accommodates constraints, revision, and planning.

While computationally demanding, energy-based approaches align more closely with how complex systems—biological or engineered—resolve ambiguity. They trade speed for structure, a trade-off that becomes increasingly relevant as AI systems move into physical and strategic domains.


Agentic Architectures as the Emerging Synthesis

In parallel, industry practice has begun converging on agentic architectures. These systems combine multiple components—models, memory, planners, and tools—into coordinated loops capable of long-horizon behavior.

In this context, LLMs persist, but their role changes. They function as:

- natural-language interfaces between users and the rest of the system,
- translators between components such as planners, memory stores, and tools, and
- generators of candidate plans and hypotheses for other components to evaluate,

rather than as monolithic intelligence engines.
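
A schematic of such a loop, in which every component (`plan_next_step`, `run_tool`, `goal_met`) is a hypothetical stand-in for what would be an LLM call, a tool API, and a real stopping criterion in practice:

```python
# Schematic agent loop: a planner proposes, tools act, memory persists,
# and the loop continues until the goal is met. Every component is a
# stand-in: in practice the planner might wrap an LLM, the memory a
# vector store, and the tool an external API.

def plan_next_step(goal: str, memory: list[str]) -> str:
    """Stand-in planner: in a real system, an LLM call proposing a step."""
    return f"step {len(memory) + 1} toward: {goal}"

def run_tool(step: str) -> str:
    """Stand-in tool execution: in a real system, a search, API call, etc."""
    return f"observation for ({step})"

def goal_met(memory: list[str], budget: int = 3) -> bool:
    """Stand-in stopping criterion: here, just a fixed step budget."""
    return len(memory) >= budget

def agent(goal: str) -> list[str]:
    memory: list[str] = []                    # state that persists across steps
    while not goal_met(memory):
        step = plan_next_step(goal, memory)   # LLM as proposal generator
        observation = run_tool(step)          # tools do the acting
        memory.append(observation)            # memory integrates the result
    return memory

if __name__ == "__main__":
    for entry in agent("summarize three sources"):
        print(entry)
```

The design choice worth noticing is that intelligence lives in the loop, not in any single call: the language model proposes, but planning, acting, and remembering are handled by other parts of the system.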

This shift reflects a broader realization: intelligence is not a single model, but an orchestrated system. The move from monoliths to agents mirrors earlier transitions in computing—from standalone programs to distributed systems, from pipelines to adaptive networks.


The Strongest Case for Scaling—and Its Limits

To understand why post-scaling architectures matter, it is necessary to acknowledge the strongest counterargument.

Scaling skeptics have been wrong before. Larger models repeatedly surprised critics, and techniques such as RLHF, tool use, and scaffolding significantly extended LLM capabilities. In many domains, these systems are “good enough” to replace human workflows.

However, these extensions rely on external structure. Planning is often delegated to heuristics. Memory is bolted on rather than integrated. Grounding is approximated through data rather than interaction.

The result is functional intelligence without internal coherence. Systems perform well until conditions change, goals extend, or environments resist textual abstraction.

This is not a failure of engineering effort. It is a signal that scaling alone addresses surface capability, not underlying structure.


Timeline: Signals Rather Than Predictions

The transition beyond pure scaling is unlikely to be abrupt. Instead, it is unfolding in phases:

- Near term: LLMs remain the core, extended with retrieval, tools, and scaffolding; agentic behavior is assembled around the model rather than inside it.
- Mid term: world models and persistent memory move from research prototypes into production systems, first in robotics, simulation, and planning-heavy domains.
- Longer term: orchestrated architectures in which language models are one component among planners, world models, and memory, not the center of the system.

This is not a forecast of general intelligence, but a structural realignment driven by constraints rather than ambition.


From Scale to Structure

The first phase of modern AI emphasized representation.
The second prioritized scale.
The next phase appears increasingly defined by structure.

World models, energy-based learning, and agentic architectures are not speculative alternatives. They are responses to architectural limits that scaling alone cannot resolve.

LeCun’s critique, once seen as contrarian, now reads as early alignment with a system-level transition already underway.