Bei Liu

Cited by

	All	Since 2019
Citations	1693	1690
h-index	17	17
i10-index	21	21

Citations per year
Year	Citations
2019	20
2020	67
2021	200
2022	450
2023	703
2024	250

Public access

Based on funding mandates
Available	10 articles
Not available	0 articles

Co-authors

Jianlong Fu	Microsoft Research	Verified email at microsoft.com
Zhaoyang Zeng	International Digital Economy Academy	Verified email at idea.edu.cn
Hongwei Xue	University of Science and Technology of China	Verified email at mail.ustc.edu.cn
Ruihua Song	Renmin University of China	Verified email at ruc.edu.cn
Huan Yang	Microsoft Research Asia	Verified email at fastmail.com
Jiebo Luo	Albert Arendt Hopeman Professor of Engineering, University of Rochester	Verified email at cs.rochester.edu
Makoto P. Kato	University of Tsukuba	Verified email at acm.org
Masatoshi Yoshikawa	Osaka Seikei University	Verified email at osaka-seikei.ac.jp
Katsumi Tanaka	Kyoto University	Verified email at fukuchiyama.ac.jp

Bei Liu

Microsoft Research

Verified email at microsoft.com

multimodal learning


Title	Authors	Venue	Cited by	Year
Pixel-bert: Aligning image pixels with text by deep multi-modal transformers	Z Huang, Z Zeng, B Liu, D Fu, J Fu	arXiv preprint arXiv:2004.00849, 2020	407	2020
Seeing out of the box: End-to-end pre-training for vision-language representation learning	Z Huang, Z Zeng, Y Huang, B Liu, D Fu, J Fu	Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2021	248	2021
Wsod2: Learning bottom-up and top-down objectness distillation for weakly-supervised object detection	Z Zeng, B Liu, J Fu, H Chao, L Zhang	Proceedings of the IEEE/CVF international conference on computer vision …, 2019	158	2019
M3p: Learning universal representations via multitask multilingual multimodal pre-training	M Ni, H Huang, L Su, E Cui, T Bharti, L Wang, D Zhang, N Duan	Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2021	98	2021
Advancing high-resolution video-language representation with large-scale video transcriptions	H Xue, T Hang, Y Zeng, Y Sun, B Liu, H Yang, J Fu, B Guo	Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022	94	2022
Clip-vip: Adapting pre-trained image-text model to video-language alignment	H Xue, Y Sun, B Liu, J Fu, R Song, H Li, J Luo	The Eleventh International Conference on Learning Representations, 2022	90*	2022
Beyond narrative description: Generating poetry from images by multi-adversarial training	B Liu, J Fu, MP Kato, M Yoshikawa	Proceedings of the 26th ACM international conference on Multimedia, 783-791, 2018	82	2018
Probing inter-modality: Visual parsing with self-attention for vision-and-language pre-training	H Xue, Y Huang, B Liu, H Peng, J Fu, H Li, J Luo	Advances in Neural Information Processing Systems 34, 4514-4528, 2021	79	2021
Mm-diffusion: Learning multi-modal diffusion models for joint audio and video generation	L Ruan, Y Ma, H Yang, H He, B Liu, J Fu, NJ Yuan, Q Jin, B Guo	Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023	74	2023
Unifying multimodal transformer for bi-directional image and text generation	Y Huang, H Xue, B Liu, Y Lu	Proceedings of the 29th ACM International Conference on Multimedia, 1138-1147, 2021	55	2021
Long-form video-language pre-training with multimodal temporal contrastive learning	Y Sun, H Xue, R Song, B Liu, H Yang, J Fu	Advances in neural information processing systems 35, 38032-38045, 2022	45	2022
Searching the search space of vision transformer	M Chen, K Wu, B Ni, H Peng, B Liu, J Fu, H Chao, H Ling	Advances in Neural Information Processing Systems 34, 8714-8726, 2021	42	2021
Aesthetic-aware image style transfer	Z Hu, J Jia, B Liu, Y Bu, J Fu	Proceedings of the 28th ACM International Conference on Multimedia, 3320-3329, 2020	29	2020
Smp challenge: An overview of social media prediction challenge 2019	B Wu, WH Cheng, P Liu, B Liu, Z Zeng, J Luo	Proceedings of the 27th ACM International Conference on Multimedia, 2667-2671, 2019	29	2019
Reference-based defect detection network	Z Zeng, B Liu, J Fu, H Chao	IEEE Transactions on Image Processing 30, 6637-6647, 2021	27	2021
Neural storyboard artist: Visualizing stories with coherent image sequences	S Chen, B Liu, J Fu, R Song, Q Jin, P Lin, X Qi, C Wang, J Zhou	Proceedings of the 27th ACM International Conference on Multimedia, 2236-2244, 2019	27	2019
Emotion reinforced visual storytelling	N Li, B Liu, Z Han, YS Liu, J Fu	Proceedings of the 2019 on International Conference on Multimedia Retrieval …, 2019	22	2019
Pave the way to grasp anything: Transferring foundation models for universal pick-place robots	J Yang, W Tan, C Jin, B Liu, J Fu, R Song, L Wang	arXiv preprint arXiv:2306.05716, 2023	16	2023
Alphablock: Embodied finetuning for vision-language reasoning in robot manipulation	C Jin, W Tan, J Yang, B Liu, R Song, L Wang, J Fu	arXiv preprint arXiv:2305.18898, 2023	13	2023
Activitynet 2019 task 3: Exploring contexts for dense captioning events in videos	S Chen, Y Song, Y Zhao, Q Jin, Z Zeng, B Liu, J Fu, A Hauptmann	arXiv preprint arXiv:1907.05092, 2019	12	2019

