Tim Dettmers
Verified email at cs.washington.edu
Title · Cited by · Year
Convolutional 2D knowledge graph embeddings
T Dettmers, P Minervini, P Stenetorp, S Riedel
Proceedings of the AAAI conference on artificial intelligence 32 (1), 2018
1495 · 2018
Sparse networks from scratch: Faster training without losing performance
T Dettmers, L Zettlemoyer
arXiv preprint arXiv:1907.04840, 2019
180 · 2019
8-bit Approximations for Parallelism in Deep Learning
T Dettmers
4th International Conference on Learning Representations, ICLR 2016, San …, 2016
144 · 2016
BASE layers: Simplifying training of large, sparse models
M Lewis, S Bhosale, T Dettmers, N Goyal, L Zettlemoyer
International Conference on Machine Learning, 6265-6274, 2021
49 · 2021
Jack the Reader - A machine reading framework
D Weissenborn, P Minervini, T Dettmers, I Augenstein, J Welbl, ...
arXiv preprint arXiv:1806.08727, 2018
9 · 2018
8-bit Optimizers via Block-wise Quantization
T Dettmers, M Lewis, S Shleifer, L Zettlemoyer
10th International Conference on Learning Representations, ICLR 2022, Virtual …, 2022
6 · 2022
High performance natural language processing
G Ilharco, C Ilharco, I Turc, T Dettmers, F Ferreira, K Lee
Proceedings of the 2020 Conference on Empirical Methods in Natural Language …, 2020
2 · 2020
Petals: Collaborative Inference and Fine-tuning of Large Models
A Borzunov, D Baranchuk, T Dettmers, M Ryabinin, Y Belkada, ...
arXiv preprint arXiv:2209.01188, 2022
2022
LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale
T Dettmers, M Lewis, Y Belkada, L Zettlemoyer
arXiv preprint arXiv:2208.07339, 2022
2022
Branch-Train-Merge: Embarrassingly Parallel Training of Expert Language Models
M Li, S Gururangan, T Dettmers, M Lewis, T Althoff, NA Smith, ...
arXiv preprint arXiv:2208.03306, 2022
2022
Training Transformers Together
A Borzunov, M Ryabinin, T Dettmers, Q Lhoest, L Saulnier, M Diskin, ...
NeurIPS 2021 Competitions and Demonstrations Track, 335-342, 2022
2022
SWARM Parallelism: Training Large Models Can Be Surprisingly Communication-Efficient
M Ryabinin, T Dettmers, M Diskin, A Borzunov
2021