Yihao Sun

M.Sc. Student, LAMDA Group
Department of Computer Science and Technology
National Key Laboratory for Novel Software Technology, Nanjing University
Supervisor: Prof. Yang Yu
Email: sunyh@lamda.nju.edu.cn

[ Github ] [ Zhihu ]

Biography

I am currently a third-year graduate student in the School of Artificial Intelligence at Nanjing University and a member of the LAMDA Group, led by Professor Zhi-Hua Zhou.

I received my B.Sc. degree in Software Engineering from the School of Software at Sichuan University in June 2021. In September 2021, I was admitted to the M.Sc. program at Nanjing University, exempt from the entrance examination, under the supervision of Prof. Yang Yu.

Research Interests

My research interest is reinforcement learning (RL). Currently, I focus on offline reinforcement learning (Offline RL) and model-based reinforcement learning (MBRL).

News

May 3, 2024 Our work Policy Representation Can be Utilized for More Generalizable Offline Dynamics Model Learning has been accepted to ICML 2024!
Jan 16, 2024 Our work Flow to better: Offline preference-based reinforcement learning via preferred trajectory generation has been accepted to ICLR 2024!
Dec 9, 2023 Our work Episodic return decomposition by difference of implicitly assigned sub-trajectory reward has been accepted to AAAI 2024!
Sep 22, 2023 I was awarded the Xiaomi Outstanding Scholarship (10 recipients schoolwide)!
Jul 16, 2023 Our work Model-based reinforcement learning with multi-step plan value estimation has been accepted to ECAI 2023!
May 18, 2023 Our repository OfflineRL-Kit has reached the milestone of 100 stars!
Apr 25, 2023 Our work Model-Bellman inconsistency for model-based offline reinforcement learning has been accepted to ICML 2023!

Selected Publications

* indicates equal contribution
  1. ICML
    Model-Bellman inconsistency for model-based offline reinforcement learning | [ Link Code ]
    Yihao Sun*, Jiaji Zhang*, Chengxing Jia, Haoxin Lin, Junyin Ye, and Yang Yu.
    In Proceedings of the 40th International Conference on Machine Learning (ICML’23). 2023.
  2. ICLR
    Flow to better: Offline preference-based reinforcement learning via preferred trajectory generation | [ Link Code ]
    Zhilong Zhang*, Yihao Sun*, Junyin Ye, Tianshuo Liu, Jiaji Zhang, and Yang Yu.
    In Proceedings of the 12th International Conference on Learning Representations (ICLR’24). 2024.


Correspondence

Email: sunyh@lamda.nju.edu.cn
Laboratory: Shaoyifu Building, Xianlin Campus of Nanjing University
Address: National Key Laboratory for Novel Software Technology, Nanjing University, Xianlin Campus Mailbox 603, 163 Xianlin Avenue, Qixia District, Nanjing 210023, China.
南京市栖霞区仙林大道163号, 南京大学仙林校区603信箱, 软件新技术国家重点实验室, 210023.