Kathryn Mohror
Title
Cited by
Year
Design, modeling, and evaluation of a scalable multi-level checkpointing system
A Moody, G Bronevetsky, K Mohror, BR de Supinski
SC'10: Proceedings of the 2010 ACM/IEEE International Conference for High …, 2010
Cited by: 777; Year: 2010
There goes the neighborhood: performance degradation due to nearby jobs
A Bhatele, K Mohror, SH Langer, KE Isaacs
Proceedings of the International Conference on High Performance Computing …, 2013
Cited by: 233; Year: 2013
Design and modeling of a non-blocking checkpointing system
K Sato, N Maruyama, K Mohror, A Moody, T Gamblin, BR de Supinski, ...
SC'12: Proceedings of the International Conference on High Performance …, 2012
Cited by: 142; Year: 2012
McrEngine: A scalable checkpointing system using data-aware aggregation and compression
TZ Islam, K Mohror, S Bagchi, A Moody, BR de Supinski, R Eigenmann
SC'12: Proceedings of the International Conference on High Performance …, 2012
Cited by: 127; Year: 2012
An ephemeral burst-buffer file system for scientific applications
T Wang, K Mohror, A Moody, K Sato, W Yu
SC'16: Proceedings of the International Conference for High Performance …, 2016
Cited by: 104; Year: 2016
The Popper convention: Making reproducible systems evaluation practical
I Jimenez, M Sevilla, N Watkins, C Maltzahn, J Lofstead, K Mohror, ...
2017 IEEE International Parallel and Distributed Processing Symposium …, 2017
Cited by: 80; Year: 2017
A 1 PB/s file system to checkpoint three million MPI tasks
R Rajachandrasekar, A Moody, K Mohror, DK Panda
Proceedings of the 22nd international symposium on High-performance parallel …, 2013
Cited by: 79; Year: 2013
A user-level InfiniBand-based file system and checkpoint strategy for burst buffers
K Sato, K Mohror, A Moody, T Gamblin, BR de Supinski, N Maruyama, ...
2014 14th IEEE/ACM International Symposium on Cluster, Cloud and Grid …, 2014
Cited by: 76; Year: 2014
VeloC: Towards high performance adaptive asynchronous checkpointing at large scale
B Nicolae, A Moody, E Gonsiorowski, K Mohror, F Cappello
2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS …, 2019
Cited by: 64; Year: 2019
ADAPT: Algorithmic differentiation applied to floating-point precision tuning
H Menon, MO Lam, D Osei-Kuffuor, M Schordan, S Lloyd, K Mohror, ...
SC18: International Conference for High Performance Computing, Networking …, 2018
Cited by: 64; Year: 2018
A large-scale study of MPI usage in open-source HPC applications
I Laguna, R Marshall, K Mohror, M Ruefenacht, A Skjellum, N Sultana
Proceedings of the International Conference for High Performance Computing …, 2019
Cited by: 60; Year: 2019
Entropy-aware I/O pipelining for large-scale deep learning on HPC systems
Y Zhu, F Chowdhury, H Fu, A Moody, K Mohror, K Sato, W Yu
2018 IEEE 26th International Symposium on Modeling, Analysis, and Simulation …, 2018
Cited by: 55; Year: 2018
Evaluating similarity-based trace reduction techniques for scalable performance analysis
K Mohror, KL Karavanic
Proceedings of the conference on high performance computing networking …, 2009
Cited by: 52; Year: 2009
Evaluating and extending user-level fault tolerance in MPI applications
I Laguna, DF Richards, T Gamblin, M Schulz, BR de Supinski, K Mohror, ...
The International Journal of High Performance Computing Applications 30 (3 …, 2016
Cited by: 50; Year: 2016
I/O characterization and performance evaluation of BeeGFS for deep learning
F Chowdhury, Y Zhu, T Heer, S Paredes, A Moody, R Goldstone, ...
Proceedings of the 48th International Conference on Parallel Processing, 1-10, 2019
Cited by: 42; Year: 2019
Managing I/O interference in a shared burst buffer system
S Thapaliya, P Bangalore, J Lofstead, K Mohror, A Moody
2016 45th International Conference on Parallel Processing (ICPP), 416-425, 2016
Cited by: 41; Year: 2016
FMI: Fault tolerant messaging interface for fast and transparent recovery
K Sato, A Moody, K Mohror, T Gamblin, BR de Supinski, N Maruyama, ...
2014 IEEE 28th International Parallel and Distributed Processing Symposium …, 2014
Cited by: 37; Year: 2014
Integrating database technology with comparison-based parallel performance diagnosis: The PerfTrack performance experiment management tool
KL Karavanic, J May, K Mohror, B Miller, K Huck, R Knapp, B Pugh
SC'05: Proceedings of the 2005 ACM/IEEE conference on Supercomputing, 39-39, 2005
Cited by: 37; Year: 2005
Detailed modeling and evaluation of a scalable multilevel checkpointing system
K Mohror, A Moody, G Bronevetsky, BR de Supinski
IEEE Transactions on Parallel and Distributed Systems 25 (9), 2255-2263, 2013
Cited by: 34; Year: 2013
MPI Sessions: Leveraging runtime infrastructure to increase scalability of applications at exascale
D Holmes, K Mohror, RE Grant, A Skjellum, M Schulz, W Bland, ...
Proceedings of the 23rd European MPI Users' Group Meeting, 121-129, 2016
Cited by: 30; Year: 2016