Ziniu Li
Other names: Zi-Niu Li
The Chinese University of Hong Kong, Shenzhen
Verified email at link.cuhk.edu.cn
Title
Cited by
Year
Error bounds of imitating policies and environments
T Xu, Z Li, Y Yu
Advances in Neural Information Processing Systems 33, 15737-15749, 2020
Cited by: 87*, Year: 2020
Error bounds of imitating policies and environments for reinforcement learning
T Xu, Z Li, Y Yu
IEEE Transactions on Pattern Analysis and Machine Intelligence 44 (10), 6968 …, 2021
Cited by: 23, Year: 2021
Self-Guided Evolution Strategies with Historical Estimated Gradients
FY Liu, ZN Li, C Qian
IJCAI, 1474-1480, 2020
Cited by: 16, Year: 2020
Rethinking ValueDice - Does It Really Improve Performance?
Z Li, T Xu, Y Yu, ZQ Luo
ICLR Blog Track, 2022
Cited by: 14, Year: 2022
HyperDQN: A Randomized Exploration Method for Deep Reinforcement Learning
Z Li, Y Li, Y Zhang, T Zhang, ZQ Luo
International Conference on Learning Representations, 2022
Cited by: 11, Year: 2022
Understanding adversarial imitation learning in small sample regime: A stage-coupled analysis
T Xu, Z Li, Y Yu, ZQ Luo
arXiv preprint arXiv:2208.01899, 2022
Cited by: 8*, Year: 2022
Policy Optimization in RLHF: The Impact of Out-of-preference Data
Z Li, T Xu, Y Yu
arXiv preprint arXiv:2312.10584, 2023
Cited by: 3, Year: 2023
ReMax: A simple, effective, and efficient method for aligning large language models
Z Li, T Xu, Y Zhang, Y Yu, R Sun, ZQ Luo
arXiv preprint arXiv:2310.10505, 2023
Cited by: 3, Year: 2023
Imitation Learning from Imperfection: Theoretical Justifications and Algorithms
Z Li, T Xu, Z Qin, Y Yu, ZQ Luo
Advances in Neural Information Processing Systems 36, 2024
Cited by: 2*, Year: 2024
Provably Efficient Adversarial Imitation Learning with Unknown Transitions
T Xu, Z Li, Y Yu, ZQ Luo
UAI, 2367-2378, 2023
Cited by: 2, Year: 2023
A Note on Target Q-learning For Solving Finite MDPs with A Generative Oracle
Z Li, T Xu, Y Yu
arXiv preprint arXiv:2203.11489, 2022
Cited by: 1, Year: 2022
Efficient Exploration by Novelty-Pursuit
Z Li, XH Chen
Distributed Artificial Intelligence: Second International Conference, DAI …, 2020
Cited by: 1, Year: 2020
Why Transformers Need Adam: A Hessian Perspective
Y Zhang, C Chen, T Ding, Z Li, R Sun, ZQ Luo
arXiv preprint arXiv:2402.16788, 2024
Year: 2024
Deploying Offline Reinforcement Learning with Human Feedback
Z Li, K Xu, L Liu, L Li, D Ye, P Zhao
arXiv preprint arXiv:2303.07046, 2023
Year: 2023