An optimal algorithm for stochastic and adversarial bandits J Zimmert, Y Seldin The 22nd International Conference on Artificial Intelligence and Statistics …, 2019 | 125 | 2019 |
Tsallis-INF: An optimal algorithm for stochastic and adversarial bandits J Zimmert, Y Seldin Journal of Machine Learning Research 22 (28), 1-49, 2021 | 118 | 2021 |
Model selection in contextual stochastic bandit problems A Pacchiano, M Phan, Y Abbasi Yadkori, A Rao, J Zimmert, T Lattimore, ... Advances in Neural Information Processing Systems 33, 10328-10337, 2020 | 105 | 2020 |
Adapting to misspecification in contextual bandits DJ Foster, C Gentile, M Mohri, J Zimmert Advances in Neural Information Processing Systems 33, 11478-11489, 2020 | 102 | 2020 |
Beating stochastic and adversarial semi-bandits optimally and simultaneously J Zimmert, H Luo, CY Wei International Conference on Machine Learning, 7683-7692, 2019 | 93 | 2019 |
An optimal algorithm for adversarial bandits with arbitrary delays J Zimmert, Y Seldin International Conference on Artificial Intelligence and Statistics, 3285-3294, 2020 | 57 | 2020 |
A model selection approach for corruption robust reinforcement learning CY Wei, C Dann, J Zimmert International Conference on Algorithmic Learning Theory, 1043-1096, 2022 | 53 | 2022 |
Connections between mirror descent, Thompson sampling and the information ratio J Zimmert, T Lattimore Advances in Neural Information Processing Systems 32, 2019 | 47 | 2019 |
A provably efficient model-free posterior sampling method for episodic reinforcement learning C Dann, M Mohri, T Zhang, J Zimmert Advances in Neural Information Processing Systems 34, 12040-12051, 2021 | 39 | 2021 |
Beyond value-function gaps: Improved instance-dependent regret bounds for episodic reinforcement learning C Dann, TV Marinov, M Mohri, J Zimmert Advances in Neural Information Processing Systems 34, 1-12, 2021 | 37 | 2021 |
The Pareto frontier of model selection for general contextual bandits TV Marinov, J Zimmert Advances in Neural Information Processing Systems 34, 17956-17967, 2021 | 24 | 2021 |
Safe screening for support vector machines J Zimmert, CS de Witt, G Kerg, M Kloft NIPS 2015 Workshop on Optimization in Machine Learning (OPT), 2015 | 24 | 2015 |
Pushing the efficiency-regret Pareto frontier for online learning of portfolios and quantum states J Zimmert, N Agarwal, S Kale Conference on Learning Theory, 182-226, 2022 | 20 | 2022 |
Factored bandits J Zimmert, Y Seldin Advances in Neural Information Processing Systems 31, 2018 | 20 | 2018 |
A blackbox approach to best of both worlds in bandits and beyond C Dann, CY Wei, J Zimmert The Thirty Sixth Annual Conference on Learning Theory, 5503-5570, 2023 | 18 | 2023 |
Refined regret for adversarial MDPs with linear function approximation Y Dai, H Luo, CY Wei, J Zimmert International Conference on Machine Learning, 6726-6759, 2023 | 17 | 2023 |
A best-of-both-worlds algorithm for bandits with delayed feedback S Masoudian, J Zimmert, Y Seldin Advances in Neural Information Processing Systems 35, 11752-11762, 2022 | 17 | 2022 |
Distributed optimization of multi-class SVMs M Alber, J Zimmert, U Dogan, M Kloft PLoS ONE 12 (6), e0178161, 2017 | 15 | 2017 |
Return of the bias: Almost minimax optimal high probability bounds for adversarial linear bandits J Zimmert, T Lattimore Conference on Learning Theory, 3285-3312, 2022 | 13 | 2022 |
Best of both worlds policy optimization C Dann, CY Wei, J Zimmert International Conference on Machine Learning, 6968-7008, 2023 | 12 | 2023 |