Jakub Grudzien Kuba
Verified email at berkeley.edu
Title
Cited by
Year
Trust region policy optimisation in multi-agent reinforcement learning
JG Kuba, R Chen, M Wen, Y Wen, F Sun, J Wang, Y Yang
International Conference on Learning Representations 2022, 2021
Cited by: 152; Year: 2021
Multi-agent reinforcement learning is a sequence modeling problem
M Wen, J Kuba, R Lin, W Zhang, Y Wen, J Wang, Y Yang
Advances in Neural Information Processing Systems 35, 16509-16521, 2022
Cited by: 105; Year: 2022
Safe multi-agent reinforcement learning for multi-robot control
S Gu, JG Kuba, Y Chen, Y Du, L Yang, A Knoll, Y Yang
Artificial Intelligence 319, 103905, 2023
Cited by: 55*; Year: 2023
IDQL: Implicit Q-learning as an actor-critic method with diffusion policies
P Hansen-Estruch, I Kostrikov, M Janner, JG Kuba, S Levine
arXiv preprint arXiv:2304.10573, 2023
Cited by: 44; Year: 2023
Settling the variance of multi-agent policy gradients
JG Kuba, M Wen, L Meng, H Zhang, D Mguni, J Wang, Y Yang
Advances in Neural Information Processing Systems 34, 13458-13470, 2021
Cited by: 44; Year: 2021
Discovered policy optimisation
C Lu, J Kuba, A Letcher, L Metz, C Schroeder de Witt, J Foerster
Advances in Neural Information Processing Systems 35, 16455-16468, 2022
Cited by: 35; Year: 2022
Heterogeneous-agent mirror learning: A continuum of solutions to cooperative MARL
JG Kuba, X Feng, S Ding, H Dong, J Wang, Y Yang
arXiv preprint arXiv:2208.01682, 2022
Cited by: 19*; Year: 2022
Mirror learning: A unifying framework of policy optimisation
J Grudzien, CAS De Witt, J Foerster
International Conference on Machine Learning, 7825-7844, 2022
Cited by: 18*; Year: 2022
Understanding value decomposition algorithms in deep cooperative multi-agent reinforcement learning
Z Dou, JG Kuba, Y Yang
arXiv preprint arXiv:2202.04868, 2022
Cited by: 6; Year: 2022
Functional Graphical Models: Structure Enables Offline Data-Driven Optimization
JG Kuba, M Uehara, P Abbeel, S Levine
arXiv preprint arXiv:2401.05442, 2024
Year: 2024
Advantage-Conditioned Diffusion: Offline RL via Generalization
JG Kuba, P Abbeel, S Levine
Year: 2023