| Title | Authors | Venue | Cited by | Year |
|---|---|---|---|---|
| LayoutLM: Pre-training of text and layout for document image understanding | Y Xu, M Li, L Cui, S Huang, F Wei, M Zhou | Proceedings of the 26th ACM SIGKDD International Conference on Knowledge … | 803 | 2020 |
| Kosmos-2: Grounding multimodal large language models to the world | Z Peng, W Wang, L Dong, Y Hao, S Huang, S Ma, F Wei | arXiv preprint arXiv:2306.14824 | 521 | 2023 |
| SuperAgent: A customer service chatbot for e-commerce websites | L Cui, S Huang, F Wei, C Tan, C Duan, M Zhou | Proceedings of ACL 2017, System Demonstrations, 97-102 | 493 | 2017 |
| Neural document summarization by jointly learning to score and select sentences | Q Zhou, N Yang, F Wei, S Huang, M Zhou, T Zhao | arXiv preprint arXiv:1807.02305 | 453 | 2018 |
| Language is not all you need: Aligning perception with language models | S Huang, L Dong, W Wang, Y Hao, S Singhal, S Ma, T Lv, L Cui, ... | Advances in Neural Information Processing Systems 36, 72096-72109 | 426 | 2023 |
| Retentive network: A successor to transformer for large language models | Y Sun, L Dong, S Huang, S Ma, Y Xia, J Xue, J Wang, F Wei | arXiv preprint arXiv:2307.08621 | 253 | 2023 |
| Learning to generate product reviews from attributes | L Dong, S Huang, F Wei, M Lapata, M Zhou, K Xu | Proceedings of the 15th Conference of the European Chapter of the … | 221 | 2017 |
| MiniLMv2: Multi-head self-attention relation distillation for compressing pretrained transformers | W Wang, H Bao, S Huang, L Dong, F Wei | arXiv preprint arXiv:2012.15828 | 218 | 2020 |
| TableBank: Table benchmark for image-based table detection and recognition | M Li, L Cui, S Huang, F Wei, M Zhou, Z Li | Proceedings of the Twelfth Language Resources and Evaluation Conference … | 213 | 2020 |
| DocBank: A benchmark dataset for document layout analysis | M Li, Y Xu, L Cui, S Huang, F Wei, Z Li, M Zhou | arXiv preprint arXiv:2006.01038 | 205 | 2020 |
| HitAnomaly: Hierarchical transformers for anomaly detection in system log | S Huang, Y Liu, C Fung, R He, Y Zhao, H Yang, Z Luan | IEEE Transactions on Network and Service Management 17 (4), 2064-2076 | 188 | 2020 |
| PromptBERT: Improving BERT sentence embeddings with prompts | T Jiang, J Jiao, S Huang, Z Zhang, D Wang, F Zhuang, F Wei, H Huang, ... | arXiv preprint arXiv:2201.04337 | 179 | 2022 |
| DeepNet: Scaling transformers to 1,000 layers | H Wang, S Ma, L Dong, S Huang, D Zhang, F Wei | IEEE Transactions on Pattern Analysis and Machine Intelligence | 154 | 2024 |
| Response generation by context-aware prototype editing | Y Wu, F Wei, S Huang, Y Wang, Z Li, M Zhou | Proceedings of the AAAI Conference on Artificial Intelligence 33 (01), 7281-7288 | 138 | 2019 |
| Language generation with multi-hop reasoning on commonsense knowledge graph | H Ji, P Ke, S Huang, F Wei, X Zhu, M Huang | arXiv preprint arXiv:2009.11692 | 136 | 2020 |
| The era of 1-bit LLMs: All large language models are in 1.58 bits | S Ma, H Wang, L Ma, L Wang, W Wang, S Huang, L Dong, R Wang, J Xue, ... | arXiv preprint arXiv:2402.17764 | 135 | 2024 |
| A length-extrapolatable transformer | Y Sun, L Dong, B Patra, S Ma, S Huang, A Benhaim, V Chaudhary, ... | arXiv preprint arXiv:2212.10554 | 134 | 2022 |
| XLM-E: Cross-lingual language model pre-training via ELECTRA | Z Chi | arXiv preprint arXiv:2106.16138 | 130 | 2021 |
| LongNet: Scaling transformers to 1,000,000,000 tokens | J Ding, S Ma, L Dong, X Zhang, S Huang, W Wang, N Zheng, F Wei | arXiv preprint arXiv:2307.02486 | 127 | 2023 |
| Language models are general-purpose interfaces | Y Hao, H Song, L Dong, S Huang, Z Chi, W Wang, S Ma, F Wei | arXiv preprint arXiv:2206.06336 | 104 | 2022 |