Yangchen Pan
Verified email at eng.ox.ac.uk - Homepage
Title
Cited by
Year
Maxmin q-learning: Controlling the estimation bias of q-learning
Q Lan, Y Pan, A Fyshe, M White
International Conference on Learning Representations 2020, 2020
Cited by 163, 2020
Fuzzy tiling activations: A simple approach to learning sparse representations online
Y Pan, K Banman, M White
ICLR 2022, 2022
Cited by 77*, 2022
Organizing experience: a deeper look at replay mechanisms for sample-based planning in continuous state domains
Y Pan, M Zaheer, A White, A Patterson, M White
IJCAI, 2018
Cited by 53, 2018
Accelerated gradient temporal difference learning
Y Pan, A White, M White
Proceedings of the AAAI Conference on Artificial Intelligence 31 (1), 2017
Cited by 31, 2017
Greedy Actor-Critic: A New Conditional Cross-Entropy Method for Policy Improvement
S Neumann, S Lim, A Joseph, Y Pan, A White, M White
ICLR 2023, 2023
Cited by 23*, 2023
The In-Sample Softmax for Offline Reinforcement Learning
C Xiao, H Wang, Y Pan, A White, M White
ICLR 2023, 2023
Cited by 17, 2023
Hill climbing on value estimates for search-control in Dyna
Y Pan, H Yao, A Farahmand, M White
IJCAI 2019, 2019
Cited by 17, 2019
Reinforcement learning with function-valued action spaces for partial differential equation control
Y Pan, A Farahmand, M White, S Nabi, P Grover, D Nikovski
International Conference on Machine Learning, 3986-3995, 2018
Cited by 16, 2018
Understanding and mitigating the limitations of prioritized experience replay
Y Pan, J Mei, A Farahmand, M White, H Yao, M Rohani, J Luo
Uncertainty in Artificial Intelligence, 1561-1571, 2022
Cited by 15, 2022
Incremental truncated LSTD
C Gehring, Y Pan, M White
IJCAI 2016, 2016
Cited by 15, 2016
Frequency-based Search-control in Dyna
Y Pan, J Mei, A Farahmand
ICLR 2020, 2020
Cited by 14, 2020
Effective sketching methods for value function approximation
Y Pan, ES Azer, M White
Uncertainty in Artificial Intelligence 2017, 2017
Cited by 11, 2017
Adapting kernel representations online using submodular maximization
M Schlegel, Y Pan, J Chen, M White
International Conference on Machine Learning, 3037-3046, 2017
Cited by 9, 2017
Memory-efficient Reinforcement Learning with Value-based Knowledge Consolidation
Q Lan, Y Pan, J Luo, AR Mahmood
Transactions on Machine Learning Research, 2023
Cited by 7*, 2023
An implicit function learning approach for parametric modal regression
Y Pan, E Imani, A Farahmand, M White
Advances in Neural Information Processing Systems 33, 11442-11452, 2020
Cited by 6, 2020
Understanding the robustness difference between stochastic gradient descent and adaptive gradient methods
A Ma, Y Pan, A Farahmand
Transactions on Machine Learning Research, 2023
Cited by 3, 2023
An Alternate Policy Gradient Estimator for Softmax Policies
S Garg, S Tosatto, Y Pan, M White, AR Mahmood
AISTATS 2022, 2021
Cited by 3, 2021
An Alternative to Variance: Gini Deviation for Risk-averse Policy Gradient
Y Luo, G Liu, P Poupart, Y Pan
2023 Advances in Neural Information Processing Systems (NeurIPS), 2023
Cited by 1, 2023
Conditionally Optimistic Exploration for Cooperative Deep Multi-Agent Reinforcement Learning
X Zhao, Y Pan, C Xiao, S Chandar, J Rajendran
UAI 2023, 2023
Cited by 1, 2023
A Simple Mixture Policy Parameterization for Improving Sample Efficiency of CVaR Optimization
Y Luo, Y Pan, H Wang, P Torr, P Poupart
arXiv preprint arXiv:2403.11062, 2024
2024