Exploring automatic, online failure recovery for scientific applications at extreme scales M Gamell, DS Katz, H Kolla, J Chen, S Klasky, M Parashar SC'14: Proceedings of the International Conference for High Performance …, 2014 | 129 | 2014 |
Energy-efficient thermal-aware autonomic management of virtualized HPC cloud infrastructure I Rodero, H Viswanathan, EK Lee, M Gamell, D Pompili, M Parashar Journal of Grid Computing 10, 447-473, 2012 | 90 | 2012 |
Exploring power behaviors and trade-offs of in-situ data analytics M Gamell, I Rodero, M Parashar, JC Bennett, H Kolla, J Chen, PT Bremer, ... Proceedings of the International Conference on High Performance Computing …, 2013 | 68 | 2013 |
Energy-aware application-centric VM allocation for HPC workloads H Viswanathan, EK Lee, I Rodero, D Pompili, M Parashar, M Gamell 2011 IEEE International Symposium on Parallel and Distributed Processing …, 2011 | 67 | 2011 |
Local recovery and failure masking for stencil-based applications at extreme scales M Gamell, K Teranishi, MA Heroux, J Mayo, H Kolla, J Chen, M Parashar Proceedings of the international conference for high performance computing …, 2015 | 56 | 2015 |
ASC ATDM Level 2 Milestone #5325: Asynchronous Many-Task Runtime System Analysis and Assessment for Next Generation Platforms. GM Baker, MT Bettencourt, SW Bova, K Franko, M Gamell, R Grant, ... Sandia National Lab.(SNL-CA), Livermore, CA (United States); Sandia National …, 2015 | 44 | 2015 |
Towards energy-efficient reactive thermal management in instrumented datacenters I Rodero, EK Lee, D Pompili, M Parashar, M Gamell, RJ Figueiredo 2010 11th IEEE/ACM International Conference on Grid Computing, 321-328, 2010 | 32 | 2010 |
Practical scalable consensus for pseudo-synchronous distributed systems T Herault, A Bouteiller, G Bosilca, M Gamell, K Teranishi, M Parashar, ... Proceedings of the International Conference for High Performance Computing …, 2015 | 29 | 2015 |
Exploring energy and performance behaviors of data-intensive scientific workflows on systems with deep memory hierarchies M Gamell, I Rodero, M Parashar, S Poole 20th Annual International Conference on High Performance Computing, 226-235, 2013 | 26 | 2013 |
Evaluating online global recovery with Fenix using application-aware in-memory checkpointing techniques M Gamell, DS Katz, K Teranishi, MA Heroux, RF Van der Wijngaart, ... 2016 45th International Conference on Parallel Processing Workshops (ICPPW …, 2016 | 22 | 2016 |
Scalable data resilience for in-memory data staging S Duan, P Subedi, K Teranishi, P Davis, H Kolla, M Gamell, M Parashar 2018 IEEE International Parallel and Distributed Processing Symposium (IPDPS …, 2018 | 16 | 2018 |
Modeling and simulating multiple failure masking enabled by local recovery for stencil-based applications at extreme scales M Gamell, K Teranishi, J Mayo, H Kolla, MA Heroux, J Chen, M Parashar IEEE Transactions on Parallel and Distributed Systems 28 (10), 2881-2895, 2017 | 16 | 2017 |
Exploring cross-layer power management for PGAS applications on the SCC platform M Gamell, I Rodero, M Parashar, R Muralidhar Proceedings of the 21st international symposium on High-Performance Parallel …, 2012 | 16 | 2012 |
CoREC: Scalable and resilient in-memory data staging for in-situ workflows S Duan, P Subedi, P Davis, K Teranishi, H Kolla, M Gamell, M Parashar ACM Transactions on Parallel Computing (TOPC) 7 (2), 1-29, 2020 | 14 | 2020 |
Exploring failure recovery for stencil-based applications at extreme scales M Gamell, K Teranishi, MA Heroux, J Mayo, H Kolla, J Chen, M Parashar Proceedings of the 24th International Symposium on High-Performance Parallel …, 2015 | 14 | 2015 |
Specification of Fenix MPI Fault Tolerance library version 0.9. M Gamell, R Van der Wijngaart, K Teranishi, M Parashar Sandia National Lab.(SNL-NM), Albuquerque, NM (United States), 2016 | 10 | 2016 |
Fenix A Portable Flexible Fault Tolerance Programming Framework for MPI Applications. R Van Der Wijngaart, M Gamell, K Teranishi, E Valenzuela, MA Heroux, ... Sandia National Lab.(SNL-NM), Albuquerque, NM (United States), 2016 | 8 | 2016 |
Scalable failure masking for stencil computations using ghost region expansion and cell to rank remapping M Gamell, K Teranishi, H Kolla, J Mayo, MA Heroux, J Chen, M Parashar SIAM Journal on Scientific Computing 39 (5), S347-S378, 2017 | 4 | 2017 |
Specification of Fenix MPI Fault Tolerance library (V. 1.0) M Gamell, R Van Der Wijngaart, K Teranishi, M Parashar Sandia National Lab.(SNL-NM), Albuquerque, NM (United States), 2016 | 4 | 2016 |
Asynchronous Many-Task Programming Models for Next Generation Platforms. JJ Wilke, MT Bettencourt, SW Bova, K Franko, M Gamell, R Grant, ... Sandia National Lab.(SNL-CA), Livermore, CA (United States); Sandia National …, 2015 | 2 | 2015 |