Development of a multi-GPU Navier-Stokes solver

Full document here

Jimmy Vianello

A new co-located finite-difference solver for the incompressible Navier-Stokes equations has been ported to GPU clusters in order to harness the computational power of current supercomputers. The solver, developed by A. Chiarini, M. Quadrio and F. Auteri, exploits the direction-splitting method proposed by Guermond and Minev in 2010.
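
As a rough illustration of why direction splitting suits massively parallel hardware, the sketch below solves a factorized problem (I - dxx)(I - dyy) u = rhs as two sweeps of independent one-dimensional tridiagonal solves. It is a minimal two-dimensional Python model, not the solver's CUDA Fortran code: the function names, the uniform grid spacing `h` and the homogeneous Dirichlet boundary conditions are all assumptions made for the example.

```python
def thomas(lower, diag, upper, rhs):
    """Solve a tridiagonal system (lower[0] and upper[-1] are unused)."""
    n = len(rhs)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    # Forward elimination
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    # Back substitution
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def solve_1d(rhs, h):
    """Solve (I - d2/dx2) u = rhs on interior points.

    Assumes second-order central differences and u = 0 at both ends
    (an assumption of this sketch, not taken from the solver).
    """
    n = len(rhs)
    off = -1.0 / h**2
    main = 1.0 + 2.0 / h**2
    return thomas([off] * n, [main] * n, [off] * n, rhs)


def split_solve_2d(rhs, h):
    """Solve (I - dxx)(I - dyy) u = rhs, with rhs[j][i] at (x_i, y_j)."""
    ny, nx = len(rhs), len(rhs[0])
    # Sweep along x: every row is an independent 1D problem
    tmp = [solve_1d(row, h) for row in rhs]
    # Sweep along y: every column is an independent 1D problem
    cols = [solve_1d([tmp[j][i] for j in range(ny)], h) for i in range(nx)]
    # Transpose back to the rhs[j][i] layout
    return [[cols[i][j] for i in range(nx)] for j in range(ny)]
```

Within each sweep the one-dimensional systems do not depend on one another, so they can be distributed across threads or devices; this independence is the property a GPU port can exploit.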

The code was developed in CUDA Fortran and targets Marconi100, the new cluster available at CINECA, accelerated by NVIDIA Tesla V100 GPUs. The main feature of the solver is that the entire time loop runs exclusively on the GPUs, using kernels implemented ad hoc to obtain the maximum possible performance in the compute-intensive parts of the algorithm. Communication was managed through the NCCL library, optimized by NVIDIA to increase the portability and scalability of multi-GPU applications.

The results were compared with those of the CPU version and agree to machine precision, which indicates that the two versions of the code are consistent. Finally, a scalability study was performed using a manufactured solution of the Navier-Stokes equations, complementing the results previously obtained on the CPU cluster Galileo (CINECA). Computational performance is outstanding.
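
To illustrate what a manufactured solution provides, the sketch below applies the idea to a one-dimensional model problem, u - u'' = f on (0, 1) with u(0) = u(1) = 0, rather than to the Navier-Stokes equations. The exact solution u = sin(pi x) is chosen first and the forcing f is derived from it, so the discretization error can be measured directly. The model problem, the function names and the grid sizes are assumptions made for the example and are not taken from the study described above.

```python
import math


def max_error(n):
    """Solve u - u'' = f with n interior points; return the max nodal error."""
    h = 1.0 / (n + 1)
    x = [(i + 1) * h for i in range(n)]
    # Manufactured solution u = sin(pi x) gives f = (1 + pi^2) sin(pi x)
    f = [(1.0 + math.pi**2) * math.sin(math.pi * xi) for xi in x]
    off = -1.0 / h**2          # sub- and super-diagonal entries
    main = 1.0 + 2.0 / h**2    # diagonal entries
    # Thomas algorithm: forward elimination
    c = [0.0] * n
    d = [0.0] * n
    c[0] = off / main
    d[0] = f[0] / main
    for i in range(1, n):
        denom = main - off * c[i - 1]
        c[i] = off / denom
        d[i] = (f[i] - off * d[i - 1]) / denom
    # Thomas algorithm: back substitution
    u = [0.0] * n
    u[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        u[i] = d[i] - c[i] * u[i + 1]
    return max(abs(ui - math.sin(math.pi * xi)) for ui, xi in zip(u, x))


def observed_order(n_coarse):
    """Estimate the convergence order from two grids, the second with h halved."""
    e_coarse = max_error(n_coarse)
    e_fine = max_error(2 * n_coarse + 1)
    return math.log(e_coarse / e_fine, 2)
```

For a second-order central-difference scheme such as this one, `observed_order(31)` should come out close to 2. Because the exact solution is known at every resolution, the same kind of test case can serve both as a correctness check and as a reproducible workload when the grid is enlarged.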