Direct Numerical Simulation of turbulent channel flow on Intel Xeon Phi (KNL) architecture

Full document here

Luca Guastoni

This work explores the optimization potential of an existing code for the Direct Numerical Simulation (DNS) of turbulent plane channel flow. Two implementations in different programming languages (CPL and Fortran) have been adapted to the Intel Xeon Phi processor (Knights Landing), a Many Integrated Core (MIC) platform with specialized vector hardware.

The purpose of this study is to provide an example of the performance attainable on this processor using a code that was originally conceived for a different architecture. Code written for general-purpose computers can run on Intel Xeon Phi without recompilation; however, adjustments are required to achieve satisfactory performance and to fully exploit the hardware resources of the processor. To this end, different optimization techniques, such as the addition of compiler directives and the modification of the data layout in memory, are described and applied.

The single-threaded performance of the two implementations is optimized and the improvement is measured. The efficiency of the different parallelization methods in the code is discussed, and a second layer of parallelism is implemented in the Fortran version. Finally, the code is also tested on the Intel Xeon E5-2697 v4 (Broadwell) processor, providing a frame of reference to evaluate the performance of Knights Landing against the multi-core architectures used in supercomputing centers, where KNL is gaining acceptance.