Guangxiang Zhao
Verified email at pku.edu.cn
Title
Cited by
Year
Understanding and improving layer normalization
J Xu, X Sun, Z Zhang, G Zhao, J Lin
NeurIPS 2019, 2019
Cited by: 456 | Year: 2019
Explicit sparse transformer: Concentrated attention through explicit selection
G Zhao, J Lin, Z Zhang, X Ren, Q Su, X Sun
arXiv preprint arXiv:1912.11637, 2019
Cited by: 152 | Year: 2019
Topology-Imbalance Learning for Semi-Supervised Node Classification
D Chen, Y Lin, G Zhao, X Ren, P Li, J Zhou, X Sun
NeurIPS 2021, 2021
Cited by: 105 | Year: 2021
Muse: Parallel multi-scale attention for sequence to sequence learning
G Zhao, X Sun, J Xu, Z Zhang, L Luo
arXiv preprint arXiv:1911.09483, 2019
Cited by: 62 | Year: 2019
Learning Relation Alignment for Calibrated Cross-modal Retrieval
S Ren, J Lin, G Zhao, R Men, A Yang, J Zhou, X Sun, H Yang
ACL 2021, 2021
Cited by: 34 | Year: 2021
Delving into the Openness of CLIP
S Ren, L Li, X Ren, G Zhao, X Sun
Findings of ACL 2023, 2023
Cited by: 23* | Year: 2023
Layer-Wise Multi-View Decoding for Improved Natural Language Generation
F Liu, X Ren, G Zhao, C You, X Wu, X Sun
arXiv preprint arXiv:2005.08081, 2022
Cited by: 19* | Year: 2022
Well-classified Examples are Underestimated in Classification with Deep Neural Networks
G Zhao, W Yang, X Ren, L Li, X Sun
AAAI 2022 (arXiv preprint arXiv:2110.06537), 2021
Cited by: 19 | Year: 2021
Review-Driven Multi-Label Music Style Classification by Exploiting Style Correlations
G Zhao, J Xu, Q Zeng, X Ren, X Sun
NAACL-HLT 2019, 2019
Cited by: 12* | Year: 2019
From Mimicking to Integrating: Knowledge Integration for Pre-Trained Language Models
L Li, Y Lin, X Ren, G Zhao, P Li, J Zhou, X Sun
Findings of EMNLP 2022, 2021
Cited by: 3* | Year: 2021
When to Trust Aggregated Gradients: Addressing Negative Client Sampling in Federated Learning
W Yang, Y Lin, G Zhao, P Li, J Zhou, X Sun
TMLR, 2023
Cited by: 1 | Year: 2023
TinyR1-32B-Preview: Boosting Accuracy with Branch-Merge Distillation
L Sun, G Zhao, X Jian, Y Wu, W Lin, Y Zhu, L Zhang, J Wu, J Ran, S Hu, ...
arXiv preprint arXiv:2503.04872, 2025
2025
Chain-of-Thought Matters: Improving Long-Context Language Models with Reasoning Path Supervision
D Zhu, X Wei, G Zhao, W Wu, H Zou, J Ran, X Wang, L Sun, X Zhang, S Li
arXiv preprint arXiv:2502.20790, 2025
2025
LongAttn: Selecting Long-context Training Data via Token-level Attention
L Wu, D Zhu, G Zhao, Z Yu, J Ran, X Wong, L Sun, S Li
arXiv preprint arXiv:2502.16860, 2025
2025
Stress Testing Generalization: How Minor Modifications Undermine Large Language Model Performance
G Zhao, S Hu, X Jian, J Wu, Y Wu, L Sun, X Zhang
arXiv preprint arXiv:2502.12459, 2025
2025