Tri Dao
Computer Science, Stanford University
Verified email at stanford.edu
Title · Cited by · Year
A kernel theory of modern data augmentation
T Dao, A Gu, A Ratner, V Smith, CD Sa, C Ré
Proceedings of the 36th International Conference on Machine Learning, ICML, 9-15, 2019
Cited by 115 · 2019
Learning fast algorithms for linear transforms using butterfly factorizations
T Dao, A Gu, M Eichhorn, A Rudra, C Ré
International conference on machine learning, 1517-1527, 2019
Cited by 56 · 2019
Gaussian quadrature for kernel features
T Dao, CM De Sa, C Ré
Advances in neural information processing systems 30, 2017
Cited by 39 · 2017
Learning compressed transforms with low displacement rank
A Thomas, A Gu, T Dao, A Rudra, C Ré
Advances in neural information processing systems 31, 2018
Cited by 38 · 2018
HiPPO: Recurrent memory with optimal polynomial projections
A Gu, T Dao, S Ermon, A Rudra, C Ré
Advances in Neural Information Processing Systems 33, 1474-1487, 2020
Cited by 37 · 2020
Low-precision random Fourier features for memory-constrained kernel approximation
J Zhang, A May, T Dao, C Ré
The 22nd International Conference on Artificial Intelligence and Statistics …, 2019
Cited by 31 · 2019
Kaleidoscope: An efficient, learnable representation for all structured linear maps
T Dao, NS Sohoni, A Gu, M Eichhorn, A Blonder, M Leszczynski, A Rudra, ...
arXiv preprint arXiv:2012.14966, 2020
Cited by 25 · 2020
MONGOOSE: A learnable LSH framework for efficient neural network training
B Chen, Z Liu, B Peng, Z Xu, JL Li, T Dao, Z Song, A Shrivastava, C Ré
International Conference on Learning Representations, 2020
Cited by 25 · 2020
On the downstream performance of compressed word embeddings
A May, J Zhang, T Dao, C Ré
Advances in neural information processing systems 32, 2019
Cited by 19 · 2019
Scatterbrain: Unifying sparse and low-rank attention
B Chen, T Dao, E Winsor, Z Song, A Rudra, C Ré
Advances in Neural Information Processing Systems 34, 17413-17426, 2021
Cited by 11* · 2021
Combining recurrent, convolutional, and continuous-time models with linear state space layers
A Gu, I Johnson, K Goel, K Saab, T Dao, A Rudra, C Ré
Advances in neural information processing systems 34, 572-585, 2021
Cited by 8 · 2021
Adaptive hashing for model counting
J Kuck, T Dao, S Zhao, B Bartan, A Sabharwal, S Ermon
Uncertainty in Artificial Intelligence, 271-280, 2020
Cited by 8 · 2020
Pixelated butterfly: Simple and efficient sparse training for neural network models
B Chen, T Dao, K Liang, J Yang, Z Song, A Rudra, C Ré
arXiv preprint arXiv:2112.00029, 2021
Cited by 7 · 2021
Rethinking neural operations for diverse tasks
N Roberts, M Khodak, T Dao, L Li, C Ré, A Talwalkar
Advances in Neural Information Processing Systems 34, 15855-15869, 2021
Cited by 6 · 2021
Approximating the permanent by sampling from adaptive partitions
J Kuck, T Dao, H Rezatofighi, A Sabharwal, S Ermon
Advances in neural information processing systems 32, 2019
Cited by 6 · 2019
FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness
T Dao, DY Fu, S Ermon, A Rudra, C Ré
arXiv preprint arXiv:2205.14135, 2022
Cited by 4 · 2022
Catformer: Designing stable transformers via sensitivity analysis
JQ Davis, A Gu, K Choromanski, T Dao, C Ré, C Finn, P Liang
International Conference on Machine Learning, 2489-2499, 2021
Cited by 4 · 2021
Knowledge Distillation As Semiparametric Inference
T Dao, GM Kamath, V Syrgkanis, L Mackey
arXiv preprint arXiv:2104.09732, 2021
Cited by 4 · 2021
Monarch: Expressive structured matrices for efficient and accurate training
T Dao, B Chen, NS Sohoni, A Desai, M Poli, J Grogan, A Liu, A Rao, ...
International Conference on Machine Learning, 4690-4721, 2022
Cited by 3 · 2022
Learning operations for neural PDE solvers
N Roberts, M Khodak, T Dao, L Li, C Ré, A Talwalkar
Proc. ICLR SimDL Workshop, 2021
Cited by 2 · 2021
Articles 1–20