Ph.D. Student
I am currently a Ph.D. student in the Department of Computer Science and Engineering at Shanghai Jiao Tong University and a member of PAIR-Lab, co-advised by Prof. Yaodong Yang and Prof. Ying Wen. I was also selected for the Wenjun Wu Honored Ph.D. Class in 2023. My research interests lie in AI Alignment and Preference-based RL. If you would like to discuss potential collaboration or common research interests, please do not hesitate to contact me.
(* indicates equal contribution)
Efficient Model-agnostic Alignment via Bayesian Persuasion
Fengshuo Bai, Mingzhi Wang, Zhaowei Zhang, Boyuan Chen, Yinda Xu, Ying Wen, Yaodong Yang
arXiv preprint arXiv:2405.18718
Efficient Preference-based Reinforcement Learning via Aligned Experience Estimation
Fengshuo Bai, Rui Zhao, Hongming Zhang, Sijia Cui, Ying Wen, Yaodong Yang, Bo Xu, Lei Han
arXiv preprint arXiv:2310.00378
PEARL: Zero-shot Cross-task Preference Alignment and Robust Reward Learning for Robotic Manipulation
Runze Liu, Yali Du, Fengshuo Bai, Jiafei Lyu, Xiu Li
International Conference on Machine Learning (ICML), 2024
Measuring Value Understanding in Language Models through Discriminator-Critique Gap
Zhaowei Zhang, Fengshuo Bai*, Jun Gao*, Yaodong Yang
arXiv preprint arXiv:2310.00378
PiCor: Multi-task Deep Reinforcement Learning with Policy Correction
Fengshuo Bai, Hongming Zhang, Tianyang Tao, Zhiheng Wu, Yanna Wang, Bo Xu
Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2023
[PDF]
Meta-Reward-Net: Implicitly Differentiable Reward Learning for Preference-based Reinforcement Learning
Runze Liu, Fengshuo Bai, Yali Du, Yaodong Yang
Conference on Neural Information Processing Systems (NeurIPS), 2022
[PDF] [Code] [Website]
Institute for AI, Peking University (Jan 2022 - Oct 2022)
Research on algorithms for Reinforcement Learning from Human Feedback.
Supervisors: Dr. Yali Du, Dr. Yaodong Yang
Pacemaker to Merit Student, 2018.09
President Scholarship of Southeast University, 2018.09
China National Scholarship, 2017.09
Silver Prize (25/1178), Kaggle Lux AI Challenge, 2021.12
Third Prize (Intro) and 7th (Research), NeurIPS 2021 MineRL Diamond Competition, 2021.10
14th, IJCAI 2020 Mahjong Artificial Intelligence Competition, 2021.01
I have served as a reviewer / PC member for the following conferences:
DAI 2023
AAMAS 2024, 2025
NeurIPS 2024