Contents

📒 Wang 2014

Fast acceleration of 2D wave propagation simulations using modern computational accelerators1

Sciwheel

Introduction

OpenACC vs OpenCL vs CUDA

Methods

Cardiac Wave Propagation Model

  • CMC: Beeler-Reuter model
  • Tissue: reaction diffusion $$ C_{m} \frac{\partial V_{m}}{\partial t}=\nabla \cdot D \nabla V_{m}-I_{ion} $$
  • no-flux boundary conditions and finite difference integration https://journals.plos.org/plosone/article/figure/image?size=large&id=10.1371/journal.pone.0086484.g001 https://journals.plos.org/plosone/article/figure/image?size=large&id=10.1371/journal.pone.0086484.g002 https://journals.plos.org/plosone/article/figure/image?size=large&id=10.1371/journal.pone.0086484.g003

CUDA and OpenCL Implementations

  • CUDA: NVIDIA GPUs
  • OpenCL: Intel MIC accelerator card in addition to NVIDIA GPUs

OpenACC Implementation

https://journals.plos.org/plosone/article/figure/image?size=large&id=10.1371/journal.pone.0086484.g004

OpenMP Implementation

  • CPU code

Results

https://journals.plos.org/plosone/article/figure/image?size=large&id=10.1371/journal.pone.0086484.g005
Fermi-architecture Tesla C2050 GPU and the Kepler architecture Tesla K20 GPU
https://journals.plos.org/plosone/article/figure/image?size=large&id=10.1371/journal.pone.0086484.g006
hand-written GPU code (Man_CUDA, Man_OpenCL) vs OpenACC targeting CUDA and OpenCL (ACC_CUDA, ACC_OpenCL)
https://journals.plos.org/plosone/article/figure/image?size=large&id=10.1371/journal.pone.0086484.g007
Fermi GPU (NVIDIA C2050) and Kepler GPU (NVIDIA K20)
https://journals.plos.org/plosone/article/figure/image?size=large&id=10.1371/journal.pone.0086484.g008
Xeon Phi coprocessor

Summary of Effective Optimizations for Different Implementations

  • Eliminating atomic operations (sequential nature)
  • Coalescing memory accesses (decreasing # of memory transactions)

Parallel Programming Tools Comparison

https://journals.plos.org/plosone/article/figure/image?size=large&id=10.1371/journal.pone.0086484.t001

  • the OpenACC implementation, taking the minimum amount of effort to program, also achieved very good speedups on GPUs and the same portability as OpenCL implementation did

Reference


  1. Wang W, Xu L, Cavazos J, Huang HH, Kay M. Fast acceleration of 2D wave propagation simulations using modern computational accelerators. PLoS ONE 2014;9(1):e86484. doi:10.1371/journal.pone.0086484. PMC3907428 ↩︎