Education in HPC and Data Science at Simula Research Lab and UiO. In SUPERDATA Workshop on Curriculum Development, Yunan, China, 2018.
Heterogeneous Computing: Programming, Performance and Applications. In CoSaS 2018 Symposium, Erlangen, Germany, 2018.
Memory Bandwidth Contention: Communication vs Computation Tradeoffs in Supercomputers with Multicore Architectures. In International Conference on Parallel and Distributed Systems (ICPADS), edited by Y. Zou. Singapore: ACM/IEEE, 2018.
Quantifying data traffic of sparse matrix-vector multiplication in a multi-level memory hierarchy. London, UK, 2018.
Towards Detailed Organ-Scale Simulations in Cardiac Electrophysiology. International Symposium on Computational Science at Scale (CoSaS), Erlangen, Germany, 2018.
Unstructured mesh partitioning in the presence of strong coefficient heterogeneity. In PDESoft 2018 Conference, Bergen, Norway, 2018.
Accelerated high-performance computing for computational cardiac electrophysiology. In The University of Tokyo, Tokyo, Japan, 2017.
Automated Translation of MATLAB Code to C++ with Performance and Traceability. In The Eleventh International Conference on Advanced Engineering Computing and Applications in Sciences (ADVCOMP 2017), edited by C. Rückemann and D. Vucinic. International Academy, Research and Industry Association (IARIA), 2017.
Heterogeneous Manycore Simulations in Cardiac Electrophysiology. In Tenth International Workshop on Programmability and Architectures for Heterogeneous Multicores (MULTIPROG-2017), Stockholm, Sweden, 2017.
Porting Tissue-Scale Cardiac Simulations to the Knights Landing Platform. In International Conference on High Performance Computing, edited by J. Kunkel. Lecture Notes in Computer Science, Springer, 2017.
Accelerating Detailed Tissue-Scale 3D Cardiac Simulations Using Heterogeneous CPU-Xeon Phi Computing. International Journal of Parallel Programming (2016): 1-23.
Enabling Tissue-Scale Cardiac Simulations Using Heterogeneous Computing on Tianhe-2. In IEEE 22nd International Conference on Parallel and Distributed Systems (ICPADS), edited by C. Zhang. ACM/IEEE, 2016.
Matlab2cpp: A Matlab-to-C++ code translator. In IEEE 2016 11th System of Systems Engineering Conference (SoSE), edited by H. P. Dahle. IEEE, 2016.
Panda: A Compiler Framework for Concurrent CPU+GPU Execution of 3D Stencil Computations on GPU-accelerated Supercomputers. International Journal of Parallel Programming (2016).
On the Performance and Energy Efficiency of the PGAS Programming Model on Multicore Architectures. In High Performance Computing & Simulation (2016) - International Workshop on Optimization of Energy Efficient HPC & Distributed Systems, edited by P. H. Ha. ACM/IEEE, 2016.
Solving 3D Time-Fractional Diffusion Equations by High-Performance Parallel Computing. Fractional Calculus and Applied Analysis 19, no. 1 (2016): 140-160.
An Analytical GPU Performance Model for 3D Stencil Computations from the Angle of Data Traffic. The Journal of Supercomputing 71, no. 7 (2015): 2433-2453.
Communication-Hiding Programming for Clusters with Multi-Coprocessor Nodes. Concurrency and Computation: Practice and Experience 27, no. 16 (2015): 4172-4185.
CPU+GPU Programming of Stencil Computations for Resource-Efficient Use of GPU Clusters. In IEEE 18th International Conference on Computational Science and Engineering. IEEE Computer Society, 2015.
Dysfunctional Sarcoplasmic Reticulum Ca2+ Release Underlies Arrhythmogenic Triggers in Catecholaminergic Polymorphic Ventricular Tachycardia: A Simulation Study in a Human Ventricular Myocyte Model. In Gordon Research Conference on Cardiac Arrhythmia. Lucca, Italy: Gordon Research Conference on Cardiac Arrhythmia, 2015.
Enabling a Uniform OpenCL Device View for Heterogeneous Platforms. IEICE Transactions on Information and Systems E98-D, no. 4 (2015): 812-823.
Multi-GPU Implementations of Parallel 3D Sweeping Algorithms with Application to Geological Folding. In ICCS 2015. Elsevier, 2015.
Parallel Computing. In Encyclopedia of Applied and Computational Mathematics, edited by B. Engquist, 1129-1132. Springer Berlin Heidelberg, 2015.
Parallel performance modeling of irregular applications in cell-centered finite volume methods over unstructured tetrahedral meshes. Journal of Parallel and Distributed Computing 76 (2015): 120-131.
Is PGAS ready for the challenge of energy efficiency? A study with the NAS benchmark. Tromsø: UiT, 2015.