About me
I am a Ph.D. student in the RLAI lab at the University of Alberta, under the supervision of Dr. Martha White and Dr. Adam White. I am passionate about reinforcement learning, with a primary research interest in using offline data to accelerate online learning.
Education
-
Doctor of Philosophy in Computing Science
Sept 2020 - Present
Expected graduation: 2025
Supervised by Dr. Martha White and Dr. Adam White
RLAI lab, University of Alberta
Reinforcement learning, Offline RL, Offline to online
-
Master of Science in Computing Science
Sept 2017 - Sept 2020
Supervised by Dr. Adam White and Dr. Martha White
RLAI lab, University of Alberta
Reinforcement learning, Representation learning
-
Bachelor of Science with Honors in Computing Science
Sept 2013 - June 2017
University of Alberta
Graduated with first class honors
Work History
-
Intern
Nov 2023 - Nov 2024
Reinforcement learning in industrial control
RL Core
-
Research Intern
May 2022 - Dec 2022
Offline reinforcement learning
Noah's Ark Lab, Huawei Technologies Canada Co., Ltd.
-
Teaching Assistant
Winter 2022
CMPUT 267 - Basics of Machine Learning, University of Alberta
Fall 2018, Fall 2017, Fall 2016
CMPUT 366 - Intelligent Systems, University of Alberta
-
Programmer
July 2016
Henan Yufa Property Limited Company
Publications
* denotes equal contribution
-
q-Exponential Family for Policy Optimization
ICLR, 2025
Lingwei Zhu*, Haseeb Shah*, Han Wang*, Martha White
-
Fat-to-Thin Policy Optimization: Offline Reinforcement Learning with Sparse Policies
ICLR, 2025
Lingwei Zhu*, Han Wang*, Yukie Nagai
-
Investigating the Properties of Neural Network Representations in Reinforcement Learning
AIJ, 2024
Han Wang, Erfan Miahi, Martha White, Marlos C. Machado, Zaheer Abbas, Raksha Kumaraswamy, Vincent Liu, Adam White
-
A Simple Mixture Policy Parameterization for Improving Sample Efficiency of CVaR Optimization
RLC, 2024
Yudong Luo, Yangchen Pan, Han Wang, Philip Torr, Pascal Poupart
-
Exploiting the Replay Memory Before Exploring the Environment: Enhancing Reinforcement Learning Through Empirical MDP Iteration
NeurIPS, 2024
Hongming Zhang, Chenjun Xiao, Chao Gao, Han Wang, Bo Xu, Martin Müller
-
In-sample Sparsemax for Offline Reinforcement Learning by Tsallis Regularization
TMLR, 2024
Lingwei Zhu, Matthew Kyle Schlegel, Han Wang, Martha White
-
The In-Sample Softmax for Offline Reinforcement Learning
ICLR, 2023
Chenjun Xiao*, Han Wang*, Yangchen Pan, Adam White, Martha White
-
Measuring and Mitigating Interference in Reinforcement Learning
CoLLAs, 2023
Vincent Liu, Han Wang, Ruo Yu Tao, Khurram Javed, Adam White, Martha White
-
Replay Memory as An Empirical MDP: Combining Conservative Estimation with Experience Replay
ICLR, 2023
Hongming Zhang, Chenjun Xiao, Han Wang, Jun Jin, Bo Xu, Martin Müller
-
No More Pesky Hyperparameters: Offline Hyperparameter Tuning for RL
TMLR, 2022
Han Wang*, Archit Sakhadeo*, Adam White, James Bell, Vincent Liu, Xutong Zhao, Puer Liu, Tadashi Kozuno, Alona Fyshe, Martha White
Preprints
* denotes equal contribution
-
An offline RL algorithm that learns a sparse policy
Under Review, title and PDF are not available yet
-
q-Exponential Family for Policy Optimization
Under Review
Lingwei Zhu*, Haseeb Shah*, Han Wang*, Martha White
-
Avoiding Model Estimation in Robust Markov Decision Processes with a Generative Model
Under Review
Wenhao Yang, Han Wang, Tadashi Kozuno, Scott M. Jordan, Zhihua Zhang