Grandmaster level in StarCraft II using multi-agent reinforcement learning O Vinyals, I Babuschkin, WM Czarnecki, M Mathieu, A Dudzik, J Chung, ... Nature 575 (7782), 350-354, 2019 | 4686 | 2019 |
Rainbow: Combining improvements in deep reinforcement learning M Hessel, J Modayil, H van Hasselt, T Schaul, G Ostrovski, W Dabney, ... Thirty-Second AAAI Conference on Artificial Intelligence, 2018 | 2745 | 2018 |
Gemini: a family of highly capable multimodal models Gemini Team, R Anil, S Borgeaud, Y Wu, JB Alayrac, J Yu, R Soricut, ... arXiv preprint arXiv:2312.11805, 2023 | 1548 | 2023 |
Deep Q-learning from Demonstrations T Hester, M Vecerik, O Pietquin, M Lanctot, T Schaul, B Piot, D Horgan, ... Association for the Advancement of Artificial Intelligence (AAAI), 2018 | 1279 | 2018 |
Universal Value Function Approximators T Schaul, D Horgan, K Gregor, D Silver Proceedings of the 32nd International Conference on Machine Learning (ICML …, 2015 | 1242 | 2015 |
Distributed Prioritized Experience Replay D Horgan, J Quan, D Budden, G Barth-Maron, M Hessel, H van Hasselt, ... International Conference on Learning Representations 2018, 2018 | 904 | 2018 |
Distributed distributional deterministic policy gradients G Barth-Maron, MW Hoffman, D Budden, W Dabney, D Horgan, D TB, ... arXiv preprint arXiv:1804.08617, 2018 | 647 | 2018 |
AlphaStar: Mastering the real-time strategy game StarCraft II O Vinyals, I Babuschkin, J Chung, M Mathieu, M Jaderberg, ... DeepMind blog 2, 20, 2019 | 558 | 2019 |
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context M Reid, N Savinov, D Teplyashin, D Lepikhin, T Lillicrap, J Alayrac, ... arXiv preprint arXiv:2403.05530, 2024 | 377 | 2024 |
Observe and look further: Achieving consistent performance on Atari T Pohlen, B Piot, T Hester, MG Azar, D Horgan, D Budden, G Barth-Maron, ... arXiv preprint arXiv:1805.11593, 2018 | 140 | 2018 |
Unicorn: Continual learning with a universal, off-policy agent DJ Mankowitz, A Žídek, A Barreto, D Horgan, M Hessel, J Quan, J Oh, ... arXiv preprint arXiv:1802.08294, 2018 | 49 | 2018 |
Selecting reinforcement learning actions using goals and observations T Schaul, DG Horgan, K Gregor, D Silver US Patent 10,628,733, 2020 | 14 | 2020 |
Reinforcement learning using distributed prioritized replay D Budden, G Barth-Maron, J Quan, DG Horgan US Patent 11,625,604, 2023 | 11 | 2023 |
Vision-Language Models as a Source of Rewards K Baumli, S Baveja, F Behbahani, H Chan, G Comanici, S Flennerhag, ... arXiv preprint arXiv:2312.09187, 2023 | 7 | 2023 |
Reinforcement learning using distributed prioritized replay D Budden, G Barth-Maron, J Quan, DG Horgan US Patent App. 18/131,753, 2023 | | 2023 |
Towards Consistent Performance on Atari using Expert Demonstrations T Pohlen, B Piot, T Hester, MG Azar, D Horgan, D Budden, G Barth-Maron, ... | | |