Rif A. Saurous
Natural TTS synthesis by conditioning WaveNet on mel spectrogram predictions
J Shen, R Pang, RJ Weiss, M Schuster, N Jaitly, Z Yang, Z Chen, Y Zhang, ...
2018 IEEE International Conference on Acoustics, Speech and Signal …, 2018
CNN architectures for large-scale audio classification
S Hershey, S Chaudhuri, DPW Ellis, JF Gemmeke, A Jansen, RC Moore, ...
2017 IEEE International Conference on Acoustics, Speech and Signal …, 2017
Tacotron: Towards end-to-end speech synthesis
Y Wang, RJ Skerry-Ryan, D Stanton, Y Wu, RJ Weiss, N Jaitly, Z Yang, ...
arXiv preprint arXiv:1703.10135, 2017
Style tokens: Unsupervised style modeling, control and transfer in end-to-end speech synthesis
Y Wang, D Stanton, Y Zhang, RJS Ryan, E Battenberg, J Shor, Y Xiao, ...
International Conference on Machine Learning, 5180-5189, 2018
Towards end-to-end prosody transfer for expressive speech synthesis with Tacotron
RJ Skerry-Ryan, E Battenberg, Y Xiao, Y Wang, D Stanton, J Shor, ...
International Conference on Machine Learning, 4693-4702, 2018
Fixing a broken ELBO
A Alemi, B Poole, I Fischer, J Dillon, RA Saurous, K Murphy
International Conference on Machine Learning, 159-168, 2018
TensorFlow Distributions
JV Dillon, I Langmore, D Tran, E Brevdo, S Vasudevan, D Moore, B Patton, ...
arXiv preprint arXiv:1711.10604, 2017
VoiceFilter: Targeted voice separation by speaker-conditioned spectrogram masking
Q Wang, H Muckenhirn, K Wilson, P Sridhar, Z Wu, J Hershey, ...
arXiv preprint arXiv:1810.04826, 2018
Deep probabilistic programming
D Tran, MD Hoffman, RA Saurous, E Brevdo, K Murphy, DM Blei
arXiv preprint arXiv:1701.03757, 2017
Unsupervised learning of semantic audio representations
A Jansen, M Plakal, R Pandya, DPW Ellis, S Hershey, J Liu, RC Moore, ...
2018 IEEE International Conference on Acoustics, Speech and Signal …, 2018
Trainable frontend for robust and far-field keyword spotting
Y Wang, P Getreuer, T Hughes, RF Lyon, RA Saurous
2017 IEEE International Conference on Acoustics, Speech and Signal …, 2017
Scalable learning of non-decomposable objectives
E Eban, M Schain, A Mackey, A Gordon, R Rifkin, G Elidan
Artificial Intelligence and Statistics, 832-840, 2017
Beyond the imitation game: Quantifying and extrapolating the capabilities of language models
A Srivastava, A Rastogi, A Rao, AAM Shoeb, A Abid, A Fisch, AR Brown, ...
arXiv preprint arXiv:2206.04615, 2022
Differentiable consistency constraints for improved deep speech enhancement
S Wisdom, JR Hershey, K Wilson, J Thorpe, M Chinen, B Patton, ...
ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and …, 2019
Uncovering latent style factors for expressive speech synthesis
Y Wang, RJ Skerry-Ryan, Y Xiao, D Stanton, J Shor, E Battenberg, ...
arXiv preprint arXiv:1711.00520, 2017
Simple, distributed, and accelerated probabilistic programming
D Tran, MW Hoffman, D Moore, C Suter, S Vasudevan, A Radul
Advances in Neural Information Processing Systems 31, 2018
AutoMOS: Learning a non-intrusive assessor of naturalness-of-speech
B Patton, Y Agiomyrgiannakis, M Terry, K Wilson, RA Saurous, D Sculley
arXiv preprint arXiv:1611.09207, 2016
Emotion recognition from human speech using temporal information and deep learning
J Kim, RA Saurous
Interspeech, 937-940, 2018
An information-theoretic analysis of deep latent-variable models
A Alemi, B Poole, I Fischer, J Dillon, RA Saurous, K Murphy
Exploring tradeoffs in models for low-latency speech enhancement
K Wilson, M Chinen, J Thorpe, B Patton, J Hershey, RA Saurous, ...
2018 16th International Workshop on Acoustic Signal Enhancement (IWAENC …, 2018