Title | Authors | Venue | Cited by | Year |
Multi-Agent Safe Planning with Gaussian Processes | Z Zhu, E Biyik, D Sadigh | 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems …, 2020 | 21 | 2020 |
Scalable Neural Contextual Bandit for Recommender Systems | Z Zhu, B Van Roy | 32nd ACM International Conference on Information and Knowledge Management …, 2023 | 15 | 2023 |
Evaluating Online Bandit Exploration In Large-Scale Recommender System | H Guo, R Naeff, A Nikulkov, Z Zhu | KDD-23 Workshop on Multi-Armed Bandits and Reinforcement Learning: Advancing …, 2023 | 14 | 2023 |
Deep Exploration for Recommendation Systems | Z Zhu, B Van Roy | 17th ACM Conference on Recommender Systems (RecSys 2023), 2023 | 13 | 2023 |
Optimizing Long-term Value for Auction-Based Recommender Systems via On-Policy Reinforcement Learning | R Xu, J Bhandari, D Korenkevych, F Liu, Y He, A Nikulkov, Z Zhu | 17th ACM Conference on Recommender Systems (RecSys 2023), 2023 | 10 | 2023 |
Learning-Based Two-Tiered Online Optimization of Region-Wide Datacenter Resource Allocation | CL Chen, H Zhou, J Chen, M Pedramfar, T Lan, Z Zhu, C Zhou, PM Ruiz, ... | IEEE Transactions on Network and Service Management, 2024 | 9* | 2024 |
IQL-TD-MPC: Implicit Q-Learning for Hierarchical Model Predictive Control | Y Xu, R Chitnis, BT Hashemi, L Lehnert, U Dogan, Z Zhu, O Delalleau | 2024 IEEE International Conference on Robotics and Automation, 2024 | 8* | 2024 |
Pearl: A Production-Ready Reinforcement Learning Agent | Z Zhu, R de Salvo Braz, J Bhandari, D Jiang, Y Wan, Y Efroni, L Wang, ... | Journal of Machine Learning Research 25 (273), 1-30, 2024 | 4 | 2024 |
Uncovering the global terrorism network | J Alison*, L Deng*, Z Zhu* | | 4* | 2017 |
Non-Stationary Contextual Bandit Learning via Neural Predictive Ensemble Sampling | Z Zhu, Y Liu, X Kuang, B Van Roy | arXiv preprint arXiv:2310.07786, 2023 | 3 | 2023 |
Offline reinforcement learning for optimizing production bidding policies | D Korenkevych, F Cheng, A Balakir, A Nikulkov, L Gao, Z Cen, Z Xu, ... | Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and …, 2024 | 1 | 2024 |
Learning to bid and rank together in recommendation systems | G Ji, W Jiang, J Li, FM Fahid, Z Chen, Y Li, J Xiao, C Bao, Z Zhu | Machine Learning 113 (5), 2559-2573, 2024 | 1 | 2024 |
Aligned Multi-Objective Optimization | Y Efroni, D Jiang, B Kretzu, J Bhandari, Z Zhu, K Ullrich | OPT 2024: Optimization for Machine Learning, 2024 | 1 | 2024 |
Epinet for Content Cold Start | HJ Jeon, S Liu, Y Li, J Lyu, H Song, J Liu, P Wu, Z Zhu | arXiv preprint arXiv:2412.04484, 2024 | | 2024 |
Exploiting Structure in Offline Multi-Agent RL: The Benefits of Low Interaction Rank | W Zhan, S Fujimoto, Z Zhu, JD Lee, DR Jiang, Y Efroni | arXiv preprint arXiv:2410.01101, 2024 | | 2024 |
Uncertainty of Joint Neural Contextual Bandit | H Guo, Z Zhu | arXiv preprint arXiv:2406.02515, 2024 | | 2024 |
An Empirical Study of Deep Reinforcement Learning in Continuing Tasks | Y Wan, D Korenkevych, Z Zhu | | | 2024 |
Efficient Deep Reinforcement Learning for Recommender Systems | Z Zhu | https://searchworks.stanford.edu/view/in00000031069, 2023 | | 2023 |