Abstract
We present the graphics processing unit (GPU) port of ASCOT5, a Monte Carlo particle-following code that solves the distribution function of minority species in fusion plasmas. Originally developed with hybrid MPI-OpenMP parallelism that takes full advantage of single instruction, multiple data operations, the code has been ported to GPU architectures using the OpenACC programming model. Subsequently, modifications were made to implement three distinct algorithmic strategies: history-based, event-based, and event-based-packing. In the history-based implementation, each GPU processing unit handles the entire history of one or more particles, whereas the event-based algorithm executes a single low-level event type at a time for all particles still alive (with or without packing the particles). Performance results on NVIDIA GPUs are presented to showcase the effectiveness and efficiency of the code adaptations for GPU execution. These results provide insights into the comparative performance of the implemented approaches on the specified hardware architecture. Portability to other architectures, such as Intel and AMD GPUs with OpenMP Offload, is also presented.
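The difference between the three strategies can be illustrated with a toy model. The sketch below is not ASCOT5 code: the `Particle` class, the three events (`push`, `collide`, `check_end`) and all constants are hypothetical, and plain Python loops stand in for GPU threads and kernels. It only shows how the loop nesting differs between the history-based, event-based, and event-based-packing approaches.

```python
from dataclasses import dataclass

# Hypothetical toy constants, not taken from ASCOT5.
DT, WALL, MAX_STEPS = 0.1, 1.0, 1000


@dataclass
class Particle:
    x: float            # 1D position of the toy marker
    v: float            # velocity
    steps: int = 0      # number of pushes performed so far
    alive: bool = True  # False once an end condition is met


def push(p):
    """Event 1: advance the position by one time step."""
    p.x += p.v * DT
    p.steps += 1


def collide(p):
    """Event 2: deterministic stand-in for a collision operator."""
    p.v *= 0.99


def check_end(p):
    """Event 3: mark the particle as finished at the wall or step limit."""
    if p.x >= WALL or p.steps >= MAX_STEPS:
        p.alive = False


def history_based(particles):
    # Outer loop over particles (one GPU thread each in a real port);
    # each particle runs its entire history before the next one starts.
    for p in particles:
        while p.alive:
            push(p)
            collide(p)
            check_end(p)


def event_based(particles, packing=False):
    # Outer loop over events: each event type is applied to every
    # particle still alive before moving on to the next event type.
    active = list(range(len(particles)))
    while any(particles[i].alive for i in active):
        for event in (push, collide, check_end):
            for i in active:
                if particles[i].alive:
                    event(particles[i])
        if packing:
            # Packing: compact the index list so finished particles
            # are no longer visited in later iterations.
            active = [i for i in active if particles[i].alive]


def make_particles(n):
    return [Particle(x=0.0, v=0.5 + 0.25 * i) for i in range(n)]
```

In this toy setup all three variants apply the same sequence of operations to each particle and therefore end in the same final state; they differ only in how the work is ordered, which is what affects divergence and memory access patterns on a GPU.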
| Original language | English |
|---|---|
| Article number | 105020 |
| Journal | Plasma Physics and Controlled Fusion |
| Volume | 67 |
| Issue number | 10 |
| DOIs | |
| Publication status | Published - Oct 2025 |
| MoE publication type | A1 Journal article-refereed |
Funding
This work, supported in part by the Swiss National Science Foundation, has been carried out within the framework of the EUROfusion Consortium—Theory and Advanced Simulation Coordination (E-TASC), funded by the European Union via the Euratom Research and Training Programme (Grant Agreement No 101052200—EUROfusion).
Keywords
- Ampere
- GPU
- Grace
- Hopper
- Monte Carlo
- OpenACC
- OpenMP