Semantic segmentation-based building footprint extraction using very high-resolution satellite images and multi-source GIS data W Li, C He, J Fang, J Zheng, H Fu, L Yu Remote Sensing 11 (4), 403, 2019 | 243 | 2019 |
TurboTransformers: an efficient GPU serving system for transformer models J Fang, Y Yu, C Zhao, J Zhou Proceedings of the 26th ACM SIGPLAN Symposium on Principles and Practice of …, 2021 | 126 | 2021 |
Colossal-AI: A unified deep learning system for large-scale parallel training S Li, H Liu, Z Bian, J Fang, H Huang, Y Liu, B Wang, Y You Proceedings of the 52nd International Conference on Parallel Processing, 766-775, 2023 | 102 | 2023 |
swDNN: A library for accelerating deep learning applications on Sunway TaihuLight J Fang, H Fu, W Zhao, B Chen, W Zheng, G Yang 2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS …, 2017 | 84 | 2017 |
Semantic segmentation based building extraction method using multi-source GIS map datasets and satellite imagery W Li, C He, J Fang, H Fu Proceedings of the IEEE Conference on Computer Vision and Pattern …, 2018 | 48 | 2018 |
Refactoring and optimizing the community atmosphere model (CAM) on the Sunway TaihuLight supercomputer H Fu, J Liao, W Xue, L Wang, D Chen, L Gu, J Xu, N Ding, X Wang, C He, ... SC'16: Proceedings of the International Conference for High Performance …, 2016 | 43 | 2016 |
swCaffe: A parallel framework for accelerating deep learning applications on Sunway TaihuLight L Li, J Fang, H Fu, J Jiang, W Zhao, C He, X You, G Yang 2018 IEEE International Conference on Cluster Computing (CLUSTER), 413-422, 2018 | 36 | 2018 |
RedSync: reducing synchronization bandwidth for distributed deep learning training system J Fang, H Fu, G Yang, CJ Hsieh Journal of Parallel and Distributed Computing 133, 30-39, 2019 | 34 | 2019 |
A parallel finite-element time-domain method for transient electromagnetic simulation H Fu, Y Wang, ES Um, J Fang, T Wei, X Huang, G Yang Geophysics 80 (4), E213-E224, 2015 | 33 | 2015 |
Parallel training of pre-trained models via chunk-based dynamic memory management J Fang, Z Zhu, S Li, H Su, Y Yu, J Zhou, Y You IEEE Transactions on Parallel and Distributed Systems 34 (1), 304-315, 2022 | 31 | 2022 |
RocBERT: Robust Chinese BERT with multimodal contrastive pretraining H Su, W Shi, X Shen, Z Xiao, T Ji, J Fang, J Zhou Proceedings of the 60th Annual Meeting of the Association for Computational …, 2022 | 31 | 2022 |
FastFold: Reducing AlphaFold training time from 11 days to 67 hours S Cheng, X Zhao, G Lu, J Fang, Z Yu, T Zheng, R Wu, X Zhang, J Peng, ... arXiv preprint arXiv:2203.00854, 2022 | 23 | 2022 |
Optimizing convolutional neural networks on the Sunway TaihuLight supercomputer W Zhao, H Fu, J Fang, W Zheng, L Gan, G Yang ACM Transactions on Architecture and Code Optimization (TACO) 15 (1), 1-26, 2018 | 22 | 2018 |
Parallel multiclass support vector machine for remote sensing data classification on multicore and many-core architectures W Li, H Fu, Y You, L Yu, J Fang IEEE Journal of Selected Topics in Applied Earth Observations and Remote …, 2017 | 14 | 2017 |
A dynamic agricultural prediction system for large-scale drought assessment on the Sunway TaihuLight supercomputer X Huang, C Yu, J Fang, G Huang, S Ni, J Hall, C Zorn, X Huang, W Zhang Computers and Electronics in Agriculture 154, 400-410, 2018 | 10 | 2018 |
Colossal-Auto: Unified automation of parallelization and activation checkpoint for large-scale models Y Liu, S Li, J Fang, Y Shao, B Yao, Y You arXiv preprint arXiv:2302.02599, 2023 | 8 | 2023 |
Elixir: Train a large language model on a small GPU cluster H Huang, J Fang, H Liu, S Li, Y You arXiv preprint arXiv:2212.05339, 2022 | 6 | 2022 |
Efficient AES implementation on Sunway TaihuLight supercomputer: A systematic approach L Li, J Fang, J Jiang, L Gan, W Zheng, H Fu, G Yang Journal of Parallel and Distributed Computing 138, 178-189, 2020 | 6 | 2020 |
SW-AES: accelerating AES algorithm on the Sunway TaihuLight L Li, J Fang, J Jiang, L Gan, W Zheng, H Fu, G Yang 2017 IEEE International Symposium on Parallel and Distributed Processing …, 2017 | 5 | 2017 |
Optimizing complex spatially-variant coefficient stencils for seismic modeling on GPU J Fang, H Fu, H Zhang, W Wu, N Dai, L Gan, G Yang 2015 IEEE 21st International Conference on Parallel and Distributed Systems …, 2015 | 4 | 2015 |