Investigating local and global information for automated audio captioning with transfer learning X Xu, H Dinkel, M Wu, Z Xie, K Yu ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and …, 2021 | 57 | 2021 |
A CRNN-GRU Based Reinforcement Learning Approach to Audio Captioning. X Xu, H Dinkel, M Wu, K Yu DCASE, 225-229, 2020 | 46 | 2020 |
Predicting tensile properties of AZ31 magnesium alloys by machine learning X Xu, L Wang, G Zhu, X Zeng JOM 72 (11), 3935-3942, 2020 | 44 | 2020 |
Voice activity detection in the wild: A data-driven approach using teacher-student training H Dinkel, S Wang, X Xu, M Wu, K Yu IEEE/ACM Transactions on Audio, Speech, and Language Processing 29, 1542-1555, 2021 | 40 | 2021 |
Can audio captions be evaluated with image caption metrics? Z Zhou, Z Zhang, X Xu, Z Xie, M Wu, KQ Zhu ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022 | 33 | 2022 |
The SJTU system for DCASE2022 challenge task 6: Audio captioning with audio-text retrieval pre-training X Xu, Z Xie, M Wu, K Yu DCASE 2022 Challenge, Tech. Rep., 2022 | 29 | 2022 |
Audio-text retrieval in context S Lou, X Xu, M Wu, K Yu ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022 | 23 | 2022 |
Text-to-audio grounding: Building correspondence between captions and sound events X Xu, H Dinkel, M Wu, K Yu ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and …, 2021 | 18 | 2021 |
Audio caption in a car setting with a sentence-level loss X Xu, H Dinkel, M Wu, K Yu 2021 12th International Symposium on Chinese Spoken Language Processing …, 2021 | 17 | 2021 |
The SJTU system for DCASE2021 challenge task 6: Audio captioning based on encoder pre-training and reinforcement learning X Xu, Z Xie, M Wu, K Yu DCASE2021 Challenge, Tech. Rep., 2021 | 16 | 2021 |
A comprehensive survey of automated audio captioning X Xu, M Wu, K Yu arXiv preprint arXiv:2205.05357, 2022 | 13 | 2022 |
Sound-based construction activity monitoring with deep learning W Xiong, X Xu, L Chen, J Yang Buildings 12 (11), 1947, 2022 | 11 | 2022 |
Automatic detection pipeline for accessing the motor severity of Parkinson’s disease in finger tapping and postural stability N Yang, DF Liu, T Liu, T Han, P Zhang, X Xu, S Lou, HG Liu, AC Yang, ... IEEE Access 10, 66961-66973, 2022 | 10 | 2022 |
BLAT: Bootstrapping language-audio pre-training based on AudioSet tag-guided synthetic data X Xu, Z Zhang, Z Zhou, P Zhang, Z Xie, M Wu, KQ Zhu Proceedings of the 31st ACM International Conference on Multimedia, 2756-2764, 2023 | 8 | 2023 |
A Lightweight Framework for Online Voice Activity Detection in the Wild. X Xu, H Dinkel, M Wu, K Yu Interspeech, 371-375, 2021 | 8 | 2021 |
Enhance temporal relations in audio captioning with sound event detection Z Xie, X Xu, M Wu, K Yu arXiv preprint arXiv:2306.01533, 2023 | 7 | 2023 |
Diversity-controllable and accurate audio captioning based on neural condition X Xu, M Wu, K Yu ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022 | 4 | 2022 |
A Large-scale Dataset for Audio-Language Representation Learning L Sun, X Xu, M Wu, W Xie arXiv preprint arXiv:2309.11500, 2023 | 2 | 2023 |
Investigating Pooling Strategies and Loss Functions for Weakly-Supervised Text-to-Audio Grounding via Contrastive Learning X Xu, M Wu, K Yu 2023 IEEE International Conference on Acoustics, Speech, and Signal …, 2023 | 2 | 2023 |
Diverse and vivid sound generation from text descriptions G Li, X Xu, L Dai, M Wu, K Yu ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023 | 2 | 2023 |