I am a third-year undergraduate majoring in Physics at Shanghai Jiaotong University, with a strong interest in AI.

See the Research interests section for more about what I am interested in, and see my CV for further details. If you share a similar passion, feel free to reach out and connect with me!

News

Education

  • BSc in Physics (Zhiyuan Honor College), Shanghai Jiaotong University, 2027 (expected)

Research interests

I mainly focus on how RL can enable models to develop generalized reasoning abilities emergently and at a sustainable scale. I believe that, as systems scale up, whether a model can develop such abilities without relying on external supervision is a key step toward AGI. My research interests in this area include:

  1. Scalable RL:
    • Emergent reasoning ability: whether RL can maximize exploration and reasoning in models without relying on a predefined structure or SFT. I also explore the potential of pure RL to drive the emergence of latent reasoning processes, particularly in the context of CoT reasoning.
    • RL stability and robustness: a critical issue in current RL frameworks is instability, particularly the train-inference mismatch. Given that this mismatch may not be easy to address at the AI infrastructure level, I aim to uncover the underlying causes of the instability and develop reliable RL training recipes.
    • Self-evolving RL: developing scalable RL frameworks that enable models to autonomously improve their reasoning and performance without constant external reward, pushing toward more adaptable and self-sustaining systems.
  2. AI + formal verifier system:
    • I am interested in building scalable solutions in which AI systems autonomously verify their own reasoning and results by interacting with a verifier system (PLVR)[^4], which might be a pathway to scalability.
    • I am also interested in how to design task-generation mechanisms that push the model to propose tasks that are challenging but not too difficult. This touches on concepts such as curriculum learning and automated self-play.

Research experience

  • Sep 2025 - Present: Research Intern @ THU C3I. Supervised by Ning Ding

  • July 2025 - Dec 2025: Research Intern @ Big AI Dream Lab, Shanghai AI Lab. Supervised by Jie Fu

  • March 2025 - May 2025: Research Intern @ SAI, Shanghai Jiaotong University. Supervised by Prof. Junchi Yan and Prof. Renxiu Xia
    • Hierarchical GeoBench: proposed a hierarchical benchmark featuring four reasoning levels in geometric problem-solving: Visual Perception, Goal-Oriented Planning, Rigorous Theorem Application, and Self-Reflective Backtracking.
    • Accepted to ICLR 2026!
  • August 2024: Research Assistant at Zhangjiang National Lab, where I independently completed an AI4PHY project.