Teng Wang
Department of Computer Science, The University of Hong Kong
Verified email at connect.hku.hk
Title
Cited by
Year
End-to-end dense video captioning with parallel decoding
T Wang, R Zhang, Z Lu, F Zheng, R Cheng, P Luo
ICCV 2021, 6847-6857, 2021
Cited by: 143; Year: 2021
Event-centric hierarchical representation for dense video captioning
T Wang, H Zheng, M Yu, Q Tian, H Hu
IEEE Transactions on Circuits and Systems for Video Technology 31 (5), 1890-1900, 2020
Cited by: 63; Year: 2020
Caption anything: Interactive image description with diverse multimodal controls
T Wang*, J Zhang*, J Fei*, Y Ge, H Zheng, Y Tang, Z Li, M Gao, S Zhao, ...
arXiv preprint arXiv:2305.02677, 2023
Cited by: 38; Year: 2023
VLMixer: Unpaired Vision-Language Pre-training via Cross-Modal CutMix
T Wang, W Jiang, Z Lu, F Zheng, R Cheng, C Yin, P Luo
ICML 2022, 2022
Cited by: 25; Year: 2022
Set-level guidance attack: Boosting adversarial transferability of vision-language pre-training models
D Lu, Z Wang, T Wang, W Guan, H Gao, F Zheng
Proceedings of the IEEE/CVF International Conference on Computer Vision, 102-111, 2023
Cited by: 11; Year: 2023
Knowledge-aware prompt tuning for generalizable vision-language models
B Kan, T Wang, W Lu, X Zhen, W Guan, F Zheng
Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2023
Cited by: 10; Year: 2023
Dense-captioning events in videos: SYSU submission to ActivityNet Challenge 2020
T Wang, H Zheng, M Yu
CVPR Workshops, 2020
Cited by: 10; Year: 2020
Image caption with endogenous–exogenous attention
T Wang, H Hu, C He
Neural Processing Letters 50, 431-443, 2019
Cited by: 10; Year: 2019
π-Tuning: Transferring Multimodal Foundation Models with Optimal Multi-task Interpolation
C Wu, T Wang, Y Ge, Z Lu, R Zhou, Y Shan, P Luo
International Conference on Machine Learning, 37713-37727, 2023
Cited by: 9; Year: 2023
Dense-Localizing Audio-Visual Events in Untrimmed Videos: A Large-Scale Benchmark and Baseline
T Geng, T Wang, J Duan, R Cong, F Zheng
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023
Cited by: 9; Year: 2023
Video understanding with large language models: A survey
Y Tang, J Bi, S Xu, L Song, S Liang, T Wang, D Zhang, J An, J Lin, R Zhu, ...
arXiv preprint arXiv:2312.17432, 2023
Cited by: 7; Year: 2023
Accelerating Vision-Language Pretraining with Free Language Modeling
T Wang, Y Ge, F Zheng, R Cheng, Y Shan, X Qie, P Luo
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023
Cited by: 6; Year: 2023
Transferable decoding with visual entities for zero-shot image captioning
J Fei, T Wang, J Zhang, Z He, C Wang, F Zheng
Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2023
Cited by: 6; Year: 2023
LLMVA-GEBC: Large language model with video adapter for generic event boundary captioning
Y Tang, J Zhang, X Wang, T Wang, F Zheng
arXiv preprint arXiv:2306.10354, 2023
Cited by: 4; Year: 2023
Learning Grounded Vision-Language Representation for Versatile Understanding in Untrimmed Videos
T Wang*, J Zhang*, F Zheng, W Jiang, R Cheng, P Luo
arXiv preprint arXiv:2303.06378, 2023
Cited by: 4; Year: 2023
Multi-modal segment assemblage network for ad video editing with importance-coherence reward
Y Tang, S Xu, T Wang, Q Lin, Q Lu, F Zheng
Proceedings of the Asian Conference on Computer Vision, 3519-3535, 2022
Cited by: 4; Year: 2022
Semantic-aware pretraining for dense video captioning
T Wang, Z Liu, F Zheng, Z Lu, R Cheng, P Luo
arXiv preprint arXiv:2204.07449, 2022
Cited by: 3; Year: 2022
PTVD: A Large-Scale Plot-Oriented Multimodal Dataset Based on Television Dramas
C Li, X Peng, T Wang, Y Ge, M Liu, X Xu, Y Wang, Y Shan
arXiv preprint arXiv:2306.14644, 2023
Cited by: 1; Year: 2023
Show, Tell and Rephrase: Diverse Video Captioning via Two-Stage Progressive Training
Z Liu, T Wang, J Zhang, F Zheng, W Jiang, K Lu
IEEE Transactions on Multimedia, 2022
Cited by: 1; Year: 2022
UniAV: Unified Audio-Visual Perception for Multi-Task Video Localization
T Geng, T Wang, Y Zhang, J Duan, W Guan, F Zheng
arXiv preprint arXiv:2404.03179, 2024
Year: 2024