Yuchen Zhuang

Hello World! My name is Yuchen Zhuang. I am a Research Scientist at Google DeepMind and a core contributor on the Gemini Thinking / Reasoning team, including to the Gemini 3.0 and Gemini 3.1 models. I obtained my Ph.D. in Machine Learning from the Georgia Institute of Technology, advised by Prof. Chao Zhang. I am also very fortunate to work closely with Prof. Le Song and Prof. Bo Dai. My research focuses on building large language models (LLMs) and LLM-based agents with reasoning and planning capabilities for challenging real-world problems such as math, coding, and gaming. My recent research covers the following directions:

  • Post-training and alignment: LLM post-training over different stages, including SFT, RLHF, preference optimization, and reward modeling for improving LLM capabilities in coding, thinking, and instruction following;
  • Advanced agentic coding: Enhancing model capabilities in complicated agentic coding tasks, e.g., software engineering (SWE) and machine learning engineering (MLE), via reinforcement learning and reward modeling;
  • Search- and tool-integrated learning: Effective and efficient frameworks that integrate external knowledge and tools into LLM reasoning, including retrieval-augmented generation (RAG), tool use, and personalization.

Selected Publications

Please refer to my Google Scholar for the full list. (* = equal contribution)


MLE-Smith: Scaling MLE Tasks with Automated Multi-Agent Pipeline
MedAgentGym: A Scalable Agentic Training Environment for Code-Centric Reasoning in Biomedical Data Science
MLE-Dojo: Interactive Environments for Empowering LLM Agents in Machine Learning Engineering
Matryoshka Pilot: Learning to Drive Black-Box LLMs with LLMs
Hephaestus: Improving Fundamental Agent Capabilities of Large Language Models through Continual Pre-Training
HYDRA: Model Factorization Framework for Black-Box LLM Personalization
BBox-Adapter: Lightweight Adapting for Black-Box Large Language Models
ToolChain*: Efficient Action Space Navigation in Large Language Models with A* Search
AdaPlanner: Adaptive Planning from Feedback with Language Models
ToolQA: A Dataset for LLM Question Answering with External Tools
Large Language Model as Attributed Training Data Generator: A Tale of Diversity and Bias

Experience

Google DeepMind, Gemini Thinking
Research Scientist • May 2025 - Present
Topic: Gemini for reasoning and coding

Amazon, Rufus Team
Applied Scientist Intern • May 2024 - Aug 2024
Topic: Pre-Training Agent LLM to Enhance Fundamental Agentic Capabilities

Adobe Research
Research Scientist Intern • May 2023 - Aug 2023
Topic: ToolChain* - Efficient Action Space Navigation with A* Search [ICLR'24]

Amazon, Personalization Team
Applied Scientist Intern • May 2022 - Aug 2022
Topic: G-STO - Sequential Shopping Intention Detection [CIKM'23]

Academic Services

  • Area Chair: NeurIPS 2026-, ICML 2026-, COLM 2026-, ACL 2024-, EMNLP 2024-, NAACL 2025-.
  • Conference Program Committee: NeurIPS 2023-2025; ICLR 2023-2026; ICML 2023-2025; COLM 2024-2025; KDD 2021-2023; ACL 2021-2024; AAAI 2023-2024; AISTATS 2024-2025; SDM 2024.

Selected Awards

  • [2024] J.P. Morgan Chase AI PhD Fellowship
  • [2023] NeurIPS Scholar Award
  • [2023] Best Paper Award, ACM BCB
  • [2023] ACM SIGKDD Student Travel Grant
  • [2020] Second Prize, Excellent Undergraduate Graduation Thesis, Jiangsu Province
  • [2019] Most Influential Graduate Award Nomination (Top 20/4,000), Southeast University
  • [2018-2019] Qingyun Sun Innovation Scholarship, Southeast University