Fengshuo Bai

Ph.D. Student
Department of Computer Science and Engineering
Shanghai Jiao Tong University

Email: fengshuobai [@] sjtu [DOT] edu [DOT] cn

Biography

I am currently a Ph.D. student in the Department of Computer Science and Engineering at Shanghai Jiao Tong University and a member of PAIR-Lab, co-advised by Prof. Yaodong Yang and Prof. Ying Wen. I was also selected for the Wenjun Wu Honored Ph.D. Class in 2023, advised by Prof. Cewu Lu. My research interests lie in Dexterous Manipulation, Preference-based RL, and AI Alignment. If you would like to discuss potential collaboration or common research interests, please do not hesitate to contact me.

News 🔥

📢 2025.7 We have released our VLA survey titled “A Survey on Vision-Language-Action Models: An Action Tokenization Perspective.”
📢 2025.6 Our paper has been accepted for oral presentation at the Artificial General Intelligence Conference (AGI-25)!

Selected Papers

(* indicates equal contribution)

A Survey on Vision-Language-Action Models: An Action Tokenization Perspective
Yifan Zhong*, Fengshuo Bai*, Shaofei Cai, Xuchuan Huang, Zhang Chen, Xiaowei Zhang, Yuanfei Wang, Shaoyang Guo, Tianrui Guan, Ka Nam Lui, Zhiquan Qi, Yitao Liang, Yuanpei Chen, Yaodong Yang

Paper ArXiv HF Github

Roadmap on Incentive Compatibility for AI Alignment and Governance in Sociotechnical Systems Oral
Zhaowei Zhang, Fengshuo Bai, Mingzhi Wang, Haoyang Ye, Chengdong Ma, Yaodong Yang
Artificial General Intelligence Conference (AGI), 2025

Paper ArXiv

Retrieval Dexterity: Efficient Object Retrieval in Clutters with Dexterous Hand
Fengshuo Bai, Yu Li, Jie Chu, Tawei Chou, Runchuan Zhu, Ying Wen, Yaodong Yang, Yuanpei Chen

Paper ArXiv Video Code

Amulet: ReAlignment During Test Time for Personalized Preference Adaptation of LLMs
Zhaowei Zhang*, Fengshuo Bai*, Qizhi Chen, Chengdong Ma, Mingzhi Wang, Haoran Sun, Zilong Zheng, Yaodong Yang
International Conference on Learning Representations (ICLR), 2025

Paper ArXiv Video Code

GRAIT: Gradient-Driven Refusal-Aware Instruction Tuning for Effective Hallucination Mitigation
Runchuan Zhu, Xinke Jiang, Jiang Wu, Zhipeng Ma, Jiahe Song, Fengshuo Bai, Dahua Lin, Lijun Wu, Conghui He
Annual Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics (NAACL), 2025

Paper ArXiv Video Code

β-DQN: Improving Deep Q-Learning By Evolving the Behavior Oral
Hongming Zhang, Fengshuo Bai, Chenjun Xiao, Chao Gao, Bo Xu, Martin Müller
International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2025

Paper ArXiv Video Code

RAT: Adversarial Attacks on Deep Reinforcement Agents for Targeted Behaviors Oral
Fengshuo Bai, Runze Liu, Yali Du, Ying Wen, Yaodong Yang
Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2025

Paper ArXiv Video Code

Efficient Model-agnostic Alignment via Bayesian Persuasion
Fengshuo Bai, Mingzhi Wang, Zhaowei Zhang, Boyuan Chen, Yinda Xu, Ying Wen, Yaodong Yang

Paper ArXiv Video Code

Efficient Preference-based Reinforcement Learning via Aligned Experience Estimation
Fengshuo Bai, Rui Zhao, Hongming Zhang, Sijia Cui, Ying Wen, Yaodong Yang, Bo Xu, Lei Han

Paper ArXiv Video Code

PEARL: Zero-shot Cross-task Preference Alignment and Robust Reward Learning for Robotic Manipulation
Runze Liu, Yali Du, Fengshuo Bai, Jiafei Lyu, Xiu Li
International Conference on Machine Learning (ICML), 2024

Paper ArXiv Video Code

Measuring Value Understanding in Language Models through Discriminator-Critique Gap
Zhaowei Zhang, Fengshuo Bai*, Jun Gao*, Yaodong Yang

Paper ArXiv Video Code

PiCor: multi-task deep reinforcement learning with policy correction Oral
Fengshuo Bai, Hongming Zhang, Tianyang Tao, Zhiheng Wu, Yanna Wang, Bo Xu
Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2023

Paper ArXiv Video Code

Meta-reward-net: Implicitly differentiable reward learning for preference-based reinforcement learning
Runze Liu, Fengshuo Bai, Yali Du, Yaodong Yang
Conference on Neural Information Processing Systems (NeurIPS), 2022

Paper ArXiv Video Code

Experiences

Institute for AI, Peking University (Jan 2022 - Oct 2022)
Research on algorithms for Reinforcement Learning from Human Feedback.
Supervisors: Dr. Yali Du, Dr. Yaodong Yang

Honors & Awards

Pacemaker to Merit Student, 2018.09
President Scholarship of Southeast University, 2018.09
China National Scholarship, 2017.09

Competitions

Silver Prize (25/1178), Kaggle Lux AI Challenge, 2021.12
3rd (Intro), 7th (Research), NeurIPS 2021 MineRL Diamond Competition, 2021.10
14th, IJCAI 2020 Mahjong Artificial Intelligence Competition, 2021.01

Services

I have served as a reviewer / PC member for the following conferences:
DAI 2023-2024
AAMAS 2024-2025
NeurIPS 2024-2025
ICLR 2025
AAAI 2025
ICML 2025
IJCAI 2025
IROS 2025
