About me

I am a Ph.D. student in the RLAI lab at the University of Alberta, under the supervision of Dr. Martha White and Dr. Adam White. I am passionate about reinforcement learning, with a primary research interest in using offline data to accelerate online learning. Additionally, I am interested in representation learning. Currently, I am an intern at RL Core Technologies, where I apply reinforcement learning to real-world problems.


Education

  1. Doctor of Philosophy in Computing Science

    Sept 2020 - Present
    Expected graduation: Early 2025

    Supervised by Dr. Martha White and Dr. Adam White
    RLAI lab, University of Alberta
    Reinforcement learning, Offline-to-online learning

  2. Master of Science in Computing Science

    Sept 2017 - Sept 2020

    Supervised by Dr. Martha White and Dr. Adam White
    RLAI lab, University of Alberta
    Reinforcement learning, Representation learning

  3. Bachelor of Science with Honors in Computing Science

    Sept 2013 - June 2017

    University of Alberta
    Graduated with first class honors


Publications

* denotes equal contribution

  1. Investigating the Properties of Neural Network Representations in Reinforcement Learning

    AIJ, 2024

    Han Wang, Erfan Miahi, Martha White, Marlos C. Machado, Zaheer Abbas, Raksha Kumaraswamy, Vincent Liu, Adam White

  2. A Simple Mixture Policy Parameterization for Improving Sample Efficiency of CVaR Optimization

    RLC, 2024

    Yudong Luo, Yangchen Pan, Han Wang, Philip Torr, Pascal Poupart

  3. Measuring and Mitigating Interference in Reinforcement Learning

    CoLLAs, 2023

    Vincent Liu, Han Wang, Ruo Yu Tao, Khurram Javed, Adam White, Martha White

  4. In-sample Sparsemax for Offline Reinforcement Learning by Tsallis Regularization

    TMLR, 2023

    Lingwei Zhu, Matthew Kyle Schlegel, Han Wang, Martha White

  5. The In-Sample Softmax for Offline Reinforcement Learning

    ICLR, 2022

    Chenjun Xiao*, Han Wang*, Yangchen Pan, Adam White, Martha White

  6. Replay Memory as An Empirical MDP: Combining Conservative Estimation with Experience Replay

    ICLR, 2022

    Hongming Zhang, Chenjun Xiao, Han Wang, Jun Jin, Bo Xu, Martin Müller

  7. No More Pesky Hyperparameters: Offline Hyperparameter Tuning for RL

    TMLR, 2022

    Han Wang*, Archit Sakhadeo*, Adam White, James Bell, Vincent Liu, Xutong Zhao, Puer Liu, Tadashi Kozuno, Alona Fyshe, Martha White


Papers Under Review

* denotes equal contribution

  1. q-Exponential Family for Policy Optimization

    Under Review, PDF not available

    Lingwei Zhu*, Haseeb Shah*, Han Wang*, Martha White

  2. Avoiding Model Estimation in Robust Markov Decision Processes with a Generative Model

    Under Review

    Wenhao Yang, Han Wang, Tadashi Kozuno, Scott M. Jordan, Zhihua Zhang

  3. Exploiting the Replay Memory Before Exploring the Environment: Enhancing Reinforcement Learning Through Empirical MDP Iteration

    Under Review, PDF not available

    Hongming Zhang, Chenjun Xiao, Chao Gao, Han Wang, Bo Xu, Martin Müller


Thesis

  1. Emergent Representations in Reinforcement Learning and Their Properties

    M.Sc., 2020

Work History

  1. Intern

    Nov 2023 - Present

    Reinforcement learning in industry
    RL Core

  2. Research Intern

    May 2022 - Dec 2022

    Offline reinforcement learning
    Noah's Ark Lab, Huawei Technologies Canada Co., Ltd.

  3. Teaching Assistant

    Winter 2022

    CMPUT 267 - Basics of Machine Learning
    University of Alberta

  4. Teaching Assistant

    Fall 2018, Fall 2017, Fall 2016

    CMPUT 366 - Intelligent Systems
    University of Alberta

  5. Programmer

    July 2016

    Henan Yufa Property Limited Company