A comprehensive performance comparison of CUDA and OpenCL J Fang, AL Varbanescu, H Sips 2011 International Conference on Parallel Processing, 216-225, 2011 | 391 | 2011 |
How well do graph-processing platforms perform? An empirical performance evaluation and analysis Y Guo, M Biczak, AL Varbanescu, A Iosup, C Martella, TL Willke 2014 IEEE 28th International Parallel and Distributed Processing Symposium …, 2014 | 131 | 2014 |
Multicore surprises: Lessons learned from optimizing Sweep3D on the Cell Broadband Engine F Petrini, G Fossum, J Fernández, AL Varbanescu, M Kistler, M Perrone 2007 IEEE International Parallel and Distributed Processing Symposium, 1-10, 2007 | 131 | 2007 |
Test-driving Intel Xeon Phi J Fang, H Sips, L Zhang, C Xu, Y Che, AL Varbanescu Proceedings of the 5th ACM/SPEC international conference on Performance …, 2014 | 117 | 2014 |
Cross-loop optimization of arithmetic intensity for finite element local assembly F Luporini, AL Varbanescu, F Rathgeber, GT Bercea, J Ramanujam, ... ACM Transactions on Architecture and Code Optimization (TACO) 11 (4), 1-25, 2015 | 70 | 2015 |
Performance gaps between OpenMP and OpenCL for multi-core CPUs J Shen, J Fang, H Sips, AL Varbanescu 2012 41st International Conference on Parallel Processing Workshops, 116-125, 2012 | 70 | 2012 |
An empirical study of Intel Xeon Phi J Fang, AL Varbanescu, H Sips, L Zhang, Y Che, C Xu arXiv preprint arXiv:1310.5842, 2013 | 57 | 2013 |
Performance traps in OpenCL for CPUs J Shen, J Fang, H Sips, AL Varbanescu 2013 21st Euromicro International Conference on Parallel, Distributed, and …, 2013 | 53 | 2013 |
A survey of parallel graph processing frameworks N Doekemeijer, AL Varbanescu Delft University of Technology 21, 2014 | 48 | 2014 |
An application-centric evaluation of OpenCL on multi-core CPUs J Shen, J Fang, H Sips, AL Varbanescu Parallel Computing 39 (12), 834-850, 2013 | 47 | 2013 |
Benchmarking graph-processing platforms: A vision Y Guo, AL Varbanescu, A Iosup, C Martella, TL Willke Proceedings of the 5th ACM/SPEC international conference on Performance …, 2014 | 43 | 2014 |
Evaluating multi-core platforms for HPC data-intensive kernels AS van Amesfoort, AL Varbanescu, HJ Sips, RV Van Nieuwpoort Proceedings of the 6th ACM Conference on Computing Frontiers, 207-216, 2009 | 34 | 2009 |
Workload partitioning for accelerating applications on heterogeneous platforms J Shen, AL Varbanescu, Y Lu, P Zou, H Sips IEEE Transactions on Parallel and Distributed Systems 27 (9), 2766-2780, 2015 | 33 | 2015 |
Glinda: a framework for accelerating imbalanced applications on heterogeneous platforms J Shen, AL Varbanescu, H Sips, M Arntzen, DG Simons Proceedings of the ACM International Conference on Computing Frontiers, 1-10, 2013 | 33 | 2013 |
Benchmarking Intel Xeon Phi to guide kernel design J Fang, AL Varbanescu, H Sips, L Zhang, Y Che, C Xu Delft University of Technology Parallel and Distributed Systems Report …, 2013 | 28 | 2013 |
The landscape of GPGPU performance modeling tools S Madougou, A Varbanescu, C de Laat, R van Nieuwpoort Parallel Computing 56, 18-33, 2016 | 27 | 2016 |
An empirical performance evaluation of GPU-enabled graph-processing systems Y Guo, AL Varbanescu, A Iosup, D Epema 2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid …, 2015 | 26 | 2015 |
Improving performance by matching imbalanced workloads with heterogeneous platforms J Shen, AL Varbanescu, P Zou, Y Lu, H Sips Proceedings of the 28th ACM international conference on Supercomputing, 241-250, 2014 | 26 | 2014 |
A polyphase filter for GPUs and multi-core processors K van der Veldt, R van Nieuwpoort, AL Varbanescu, C Jesshope Proceedings of the 2012 workshop on High-Performance Computing for Astronomy …, 2012 | 24 | 2012 |
Programming multicore and many-core computing systems S Pllana, F Xhafa John Wiley & Sons, 2017 | 22 | 2017 |