Expert Recognition
NVIDIA Senior Research Manager Jim Fan recently shared an in-depth evaluation of DeepSeek R1 on social media. Fan is the co-founder of GEAR Lab, the lead of Project GR00T, a Stanford Ph.D., and OpenAI's first intern, so his views carry significant weight in the industry. He particularly emphasized that DeepSeek, a non-US company, has made outstanding contributions to open-source AI development.
Inheritor of the Open-Source Spirit
In his commentary, Fan noted: "We are living in a timeline where a non-US company is keeping the original mission of OpenAI alive - truly open, frontier research that empowers all. It makes no sense. The most entertaining outcome is the most likely." He particularly appreciated that DeepSeek not only open-sources a barrage of models but also spills all the training secrets.
Deep Analysis of Technical Innovations
After carefully reading DeepSeek R1's technical paper, Fan highlighted several key technical breakthroughs:
- Pure Reinforcement Learning Approach:
  - Employs a "cold start" method, purely driven by RL, with no SFT at all
  - Reminiscent of AlphaZero's breakthrough in mastering Go, Shogi, and Chess from scratch
  - Considered the most significant takeaway from the paper
- Innovative Reward Mechanism:
  - Uses groundtruth rewards computed by hardcoded rules
  - Avoids learned reward models that RL can easily hack against
- Evolution of Thinking Time:
  - Model's thinking time steadily increases as training proceeds
  - This is an emergent property, not pre-programmed behavior
- GRPO Algorithm Innovation:
  - Removes the critic net from PPO
  - Uses the average reward of multiple samples instead
  - Simple method to reduce memory use
  - Notably, GRPO was invented by DeepSeek in February 2024
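To make the reward and GRPO points above more concrete, here is a minimal Python sketch under stated assumptions. It is not DeepSeek's code. The `Answer: <integer>` output format and the names `rule_based_reward` and `group_relative_advantages` are hypothetical, chosen only for illustration. The sketch scores a group of sampled completions with a hardcoded rule, then uses the group's average reward as the baseline in place of a critic network. Dividing by the group's standard deviation follows the published GRPO formulation; the summary above mentions only the average.

```python
import re
from statistics import mean, pstdev

# Hypothetical output format: the completion must end with "Answer: <integer>".
ANSWER_PATTERN = re.compile(r"Answer:\s*(-?\d+)\s*$")


def rule_based_reward(completion: str, reference: int) -> float:
    """Hardcoded rule: 1.0 if the final answer equals the reference, else 0.0."""
    match = ANSWER_PATTERN.search(completion.strip())
    if match is None:
        return 0.0  # no parseable final answer
    return 1.0 if int(match.group(1)) == reference else 0.0


def group_relative_advantages(rewards: list[float], eps: float = 1e-8) -> list[float]:
    """Score each sample against the group mean instead of a learned critic."""
    baseline = mean(rewards)   # average reward of the sampled group
    spread = pstdev(rewards)   # group standard deviation, used for normalization
    return [(r - baseline) / (spread + eps) for r in rewards]


# Four sampled completions for the same prompt ("What is 2 + 2?").
completions = [
    "2 + 2 is 4. Answer: 4",
    "I think it is 5. Answer: 5",
    "Let me check: 2 + 2 = 4. Answer: 4",
    "No final answer given",
]
rewards = [rule_based_reward(c, reference=4) for c in completions]
advantages = group_relative_advantages(rewards)
print(rewards)
print(advantages)
```

Correct completions end up with positive advantages and incorrect ones with negative advantages, with no learned reward model or value network involved. Because the baseline is just a group statistic, the only extra state needed is the list of rewards for that group, which is the source of the memory saving noted above.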
New Paradigm of Technical Impact
Fan specifically pointed out that impact in AI can be achieved in different ways: "Impact can be done by 'ASI achieved internally' or mythical names like 'Project Strawberry'. Impact can also be done by simply dumping the raw algorithms and matplotlib learning curves." This perspective emphasizes the importance of openness and transparency.
Example of Sustained Innovation
In Fan's view, DeepSeek is perhaps the first open-source project that shows major, sustained growth of an RL flywheel. This continuous technical progress, combined with an open attitude, sets an important benchmark for the entire AI community.
Conclusion
Jim Fan's evaluation affirms DeepSeek R1's technical achievements and emphasizes its contributions to AI democratization and the open-source spirit. Coming from a prominent industry researcher, this recognition underscores DeepSeek's position in the global AI landscape.