The Fork in the Road in 2025: An AI Researcher's Yearly Reflection (Part One)
Source: Xinzhiyuan | Editor: Taozi
Clear Choices Behind Career Transitions
What can a researcher learn when facing unexpected changes at a critical point in their career?
The career upheaval of 2025 gave Tian Yuandong an opportunity to examine his choices through a classic decision-making framework. When he was invited early in the year to join the large-scale project “Emergency,” this AI scientist, long focused on reinforcement learning research, drew up a 2x2 matrix in advance, mapping out four possible outcomes. But reality handed him a fifth, one that fell outside anything he had anticipated.
The surprise deepened his understanding of social complexity. Even so, during those months of work his team made breakthroughs on core reinforcement learning problems: training stability, the interplay between training and inference, model architecture design, the coupling of pretraining and mid-training, long-chain reasoning algorithms, data generation methods, post-training framework design, and more. These results brought a significant paradigm shift to his subsequent research directions.
Tian Yuandong admits that leaving a big company had long been on his mind. Over a career of more than a decade he had considered quitting several times, and nearly did at the end of 2023, but financial and family considerations repeatedly changed his mind. In recent years he has joked that his words and actions seemed to be “hinting” that the company should let him go. This time, he was finally “helped” into a decision.
Interestingly, this “zigzag” trajectory has become a source of his creativity. As the old saying goes, when the road to office is blocked, the poet gains: the richer the life experience, the deeper the poetry. A life that is too smooth, in fact, lacks the tension that life itself provides.
He also recalls that in early 2021, after writing a few lines in his annual review reflecting on why a paper had not been accepted, he received some not-so-friendly feedback. He chose to stay silent, even acting in front of others as though he had just been promoted. Six months later the strategy paid off: he was indeed promoted. And the work that had been overlooked in early 2021 won the ICML Best Paper Award that July, becoming a classic in representation learning.
After October 22, all of his communication channels were temporarily overwhelmed: hundreds of messages, emails, and meeting invites flooded in every day, and it took weeks for life to return to normal. He is grateful for everyone’s concern during that period, though he admits some messages may not have been answered promptly.
Eventually, among invitations from several top tech companies, he chose to join a new startup as a co-founder. Details are temporarily confidential; he prefers to focus on work rather than disclose prematurely.
Research Roadmap for 2025: Three Main Lines
Tian Yuandong’s research agenda is clear and organized around three main lines: continuous latent space reasoning, large-model reasoning efficiency, and model interpretability.
The Spread of Continuous Latent Space Reasoning
The continuous latent space reasoning work (Coconut, COLM’25), released at the end of 2024, resonated widely in 2025. Across the research community, people began asking: how can the idea be applied to reinforcement learning and pretraining? How can its training efficiency and computational cost be improved?
Although his team was later reassigned to other projects and could not pursue this line further, the direction has proven its value. Earlier this year they published a theoretical analysis, “Reasoning by Superposition” (NeurIPS’25), which rigorously establishes, from a mathematical standpoint, the advantages of continuous latent space reasoning over conventional approaches; it attracted considerable attention.
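For readers new to the idea, the sketch below illustrates the core mechanism described in the Coconut paper: instead of decoding a token at every step, the model’s last hidden state is fed back in as the next input embedding for a few “continuous thought” steps before normal decoding resumes. This is a minimal, assumption-laden sketch, not the authors’ implementation: the “gpt2” checkpoint is only a stand-in, and the actual Coconut training curriculum (special markers, staged chain-of-thought data) is not reproduced.

```python
# Minimal sketch of Coconut-style continuous latent reasoning (illustrative only).
# Assumes a Hugging Face causal LM; "gpt2" is just a stand-in checkpoint.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # stand-in model for illustration
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).eval()


@torch.no_grad()
def generate_with_latent_thoughts(prompt, num_latent_steps=4, max_new_tokens=32):
    input_ids = tok(prompt, return_tensors="pt").input_ids
    embeds = model.get_input_embeddings()(input_ids)              # (1, T, d)

    # 1) "Continuous thoughts": instead of sampling a token, feed the last
    #    hidden state back in as the next input embedding.
    for _ in range(num_latent_steps):
        out = model(inputs_embeds=embeds, output_hidden_states=True)
        last_hidden = out.hidden_states[-1][:, -1:, :]            # (1, 1, d)
        embeds = torch.cat([embeds, last_hidden], dim=1)

    # 2) Switch back to ordinary token-by-token decoding for the answer.
    generated = []
    for _ in range(max_new_tokens):
        out = model(inputs_embeds=embeds)
        next_id = out.logits[:, -1, :].argmax(dim=-1, keepdim=True)   # (1, 1)
        if next_id.item() == tok.eos_token_id:
            break
        generated.append(next_id.item())
        next_embed = model.get_input_embeddings()(next_id)
        embeds = torch.cat([embeds, next_embed], dim=1)

    return tok.decode(generated)


print(generate_with_latent_thoughts("Q: 12 + 35 = ?\nA:"))
```

An off-the-shelf checkpoint has never been trained to exploit the latent steps, so this only shows the mechanics; the reported gains come from post-training the model to use them.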
Multi-Dimensional Breakthroughs in Reasoning Efficiency
Reducing the inference cost of large models is a systems-level engineering challenge, and Tian Yuandong’s team attacked it along several dimensions:
Token-level Optimization: Token Assorted (ICLR’25) first learns discrete tokens in latent space (via a VQ-VAE), then mixes these latent tokens with text tokens during post-training, substantially reducing inference cost while also improving performance.
Confidence-Driven Inference Termination: DeepConf monitors the confidence of each generated token and decides dynamically whether to terminate inference early, greatly reducing the number of tokens consumed. In many majority-voting settings it even outperforms earlier methods (a rough sketch of this idea appears below).
Parallel Inference Chain Training Acceleration: ThreadWeaver creates parallel inference chains and co-optimizes them through post-training, speeding up the overall reasoning process.
Additionally, the team explored reinforcement learning-driven reasoning abilities on small models (Sandwiched Policy Gradient), even enabling complex reasoning in lightweight models like MobileLLM-R1.
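As a rough illustration of the confidence-driven termination idea mentioned above (not DeepConf’s exact criterion, whose confidence measure and stopping rule are more refined), one can track a sliding-window confidence over recently generated tokens and abandon a reasoning trace once it drops below a threshold. The `step_fn` callback below is hypothetical, standing in for whatever decoding loop produces tokens and their top-k log-probabilities.

```python
# Illustrative sketch of confidence-driven early termination for a reasoning trace.
# The confidence measure and stopping rule used by DeepConf itself may differ; the
# `step_fn` callback is a hypothetical stand-in for a real decoding loop.
import math
from collections import deque


def token_confidence(topk_logprobs):
    """Confidence of one generated token, here simply the probability of the
    chosen (top-1) token; other scores (e.g. negative entropy) also work."""
    return math.exp(max(topk_logprobs))


def generate_with_early_stop(step_fn, max_tokens=2048, window=64, threshold=0.35):
    """Generate a reasoning trace, stopping early when confidence sags.

    step_fn() -> (token_id, topk_logprobs, is_eos)   # hypothetical interface

    The trace is abandoned once the sliding-window mean confidence over the last
    `window` tokens falls below `threshold`, so low-confidence traces do not burn
    their full token budget (useful when sampling many traces for voting).
    """
    tokens, recent = [], deque(maxlen=window)
    for _ in range(max_tokens):
        token_id, topk_logprobs, is_eos = step_fn()
        tokens.append(token_id)
        recent.append(token_confidence(topk_logprobs))
        if is_eos:
            break
        if len(recent) == window and sum(recent) / window < threshold:
            break  # trace looks unpromising: terminate early
    return tokens
```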
Interpretability: From “Why It Works” to “Why It Must Work”
Tian Yuandong’s interest in the grokking phenomenon (sudden insight) stems from a puzzle from two years ago: when analyzing representation learning, he could describe the learning dynamics and collapse mechanisms, yet could not answer the fundamental questions. What representations does the model actually learn? How do those representations relate to the structure of the data? What degree of generalization can they achieve?
The grokking phenomenon, a sudden shift from memorization to generalization, looks like a window into this mystery. Early explorations were difficult: the 2024 work COGS (NeurIPS’25) could only analyze special cases, which left him unsatisfied. After more than a year of reflection and many dialogues with GPT, his recent work “Provable Scaling Laws” marks a major breakthrough: it can analyze phenomena beyond the reach of the linear NTK framework and explains the training dynamics behind feature emergence reasonably well. The examples remain somewhat special, but at least a new window has been opened.
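For readers who have not seen grokking first-hand, the memorize-then-generalize gap is easiest to appreciate in the classic toy setting of modular arithmetic. The sketch below only reproduces that folklore setup, not the experiments behind COGS or Provable Scaling Laws; the hyperparameters (training fraction, weight decay, step count) are assumptions and usually need tuning before the delayed jump in test accuracy appears.

```python
# Toy "grokking" setup: modular addition learned by a small MLP with weight decay.
# Hyperparameters are illustrative; the hallmark is train accuracy saturating long
# before test accuracy suddenly jumps.
import torch
import torch.nn as nn

P = 97                                        # work modulo a prime
pairs = [(a, b) for a in range(P) for b in range(P)]
perm = torch.randperm(len(pairs))
n_train = int(0.4 * len(pairs))               # deliberately small training split
train_idx, test_idx = perm[:n_train], perm[n_train:]


def encode(idx):
    ab = torch.tensor([pairs[int(i)] for i in idx])
    x = torch.zeros(len(idx), 2 * P)
    x[torch.arange(len(idx)), ab[:, 0]] = 1.0          # one-hot for a
    x[torch.arange(len(idx)), P + ab[:, 1]] = 1.0      # one-hot for b
    y = (ab[:, 0] + ab[:, 1]) % P
    return x, y


Xtr, ytr = encode(train_idx)
Xte, yte = encode(test_idx)

model = nn.Sequential(nn.Linear(2 * P, 256), nn.ReLU(), nn.Linear(256, P))
opt = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1.0)
loss_fn = nn.CrossEntropyLoss()

for step in range(1, 50_001):
    opt.zero_grad()
    loss = loss_fn(model(Xtr), ytr)
    loss.backward()
    opt.step()
    if step % 2_000 == 0:
        with torch.no_grad():
            tr = (model(Xtr).argmax(-1) == ytr).float().mean().item()
            te = (model(Xte).argmax(-1) == yte).float().mean().item()
        # Memorization shows up as train_acc -> 1.0 early; generalization
        # (test_acc -> 1.0) may only appear tens of thousands of steps later.
        print(f"step {step:6d}  train_acc {tr:.3f}  test_acc {te:.3f}")
```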
His final piece of the year, “The path not taken”, is the one he finds most satisfying: it gives a preliminary answer at the level of the weights, explaining why reinforcement learning and supervised fine-tuning (SFT) behave so differently.
SFT tends to overfit and to forget catastrophically. The surface reason is that its training data is not on-policy; the deeper reason is that external data drives large changes in the principal weight components, undermining the stability of the model’s “foundation.” Reinforcement learning, because it uses on-policy data, leaves the principal weight components essentially unchanged and modifies only the minor ones, thereby avoiding catastrophic forgetting; these weight changes are also more dispersed (especially under bf16 precision).
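One way to make the “principal vs. minor components” claim concrete is a simple diagnostic: decompose a base weight matrix with an SVD and measure how much of a fine-tuning update lands in the span of the top singular directions. The sketch below is an illustrative, assumption-laden check (the checkpoint names in the usage comment are hypothetical), not the analysis actually carried out in “The path not taken.”

```python
# Illustrative diagnostic for the "principal vs. minor components" claim: decompose
# a base weight matrix W0 with an SVD, then measure how much of a fine-tuning update
# dW = W1 - W0 lands in the subspace spanned by W0's top-k singular directions.
import torch


def principal_energy_fraction(W0: torch.Tensor, W1: torch.Tensor, top_k: int = 32) -> float:
    """Fraction of ||W1 - W0||_F^2 lying in the (top-k left) x (top-k right)
    singular subspace of the base weights W0."""
    U, S, Vh = torch.linalg.svd(W0, full_matrices=False)
    dW = W1 - W0
    coords = U.T @ dW @ Vh.T                  # coordinates of dW in W0's singular bases
    principal = coords[:top_k, :top_k].pow(2).sum()
    total = dW.pow(2).sum()
    return (principal / total).item()


# Hypothetical usage with three checkpoints of the same layer (names are made up):
#   frac_sft = principal_energy_fraction(W_base, W_after_sft)
#   frac_rl  = principal_energy_fraction(W_base, W_after_rl)
# The article's claim predicts frac_rl << frac_sft: RL leaves the principal
# directions of the base weights largely untouched, while SFT perturbs them heavily.
```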
Why It Is Worth Believing in Interpretability
Many people believe that interpretability, that is, the question of “why AI works so well,” is not an important enough problem. For Tian Yuandong, however, it is a core question about the future.
Consider two future scenarios:
Scenario 1: If scaling alone achieves AGI or even ASI, the value of human labor approaches zero. In that case AI, as one enormous black box, solves every problem, and the most urgent question becomes: how do we ensure that this superintelligent system always acts benevolently and never secretly deceives or causes harm? The answer inevitably runs through interpretability research.
Scenario 2: If the scaling path hits a bottleneck and humanity cannot keep up with exponentially growing resource demands, alternative approaches must be found. Understanding why models work and what makes them fail then becomes essential, and interpretability research is the foundation of that alternative path.
In either scenario, interpretability is the key to the puzzle. Even if AI were omniscient and benevolent, human nature would still drive us to ask why it can do what it does. After all, the “black box” itself breeds chains of suspicion.
In an era when large models reach or surpass the average human level, the “Dark Forest” law from The Three-Body Problem may reappear in another form. For now, opening the black box of trained models and understanding their internal circuits remains the first task.
The real challenge of interpretability is to start from first principles, that is, from the model architecture, gradient descent, and the intrinsic structure of the data, and to explain why models converge to those disentangled, sparse, low-rank, modular, and compositional features. Why do so many equivalent explanations exist? Which hyperparameters trigger the emergence of these structures? How are they connected to one another?
When the inevitability of feature emergence in large models can be derived directly from the equations of gradient descent, interpretability will rise from the “evidence gathering” of biology to the “principle deduction” of physics, able to guide practice and to open new paths for designing the next generation of AI.
To borrow an analogy from the physics of four centuries ago: AI today has its Tycho Brahe (the data gatherers), a few Keplers (those who propose hypotheses), but no Newton yet (the discoverer of principles). When that moment arrives, the world will be unrecognizable.