Type that can be used to set additional configuration parameters for dtFFT.
| Type | Visibility | Attributes | Name | Description |
|---|---|---|---|---|
| logical(kind=c_bool) | public | :: | enable_log | Should dtFFT print additional information during plan creation or not. Default is false. |
| logical(kind=c_bool) | public | :: | enable_z_slab | Should dtFFT use the Z-slab optimization or not. Default is true. One should consider disabling the Z-slab optimization in order to resolve |
| logical(kind=c_bool) | public | :: | enable_y_slab | Should dtFFT use the Y-slab optimization or not. Default is false. One should consider disabling the Y-slab optimization in order to resolve |
| integer(kind=c_int32_t) | public | :: | n_measure_warmup_iters | Number of warmup iterations to execute during backend and kernel autotuning when effort level is Default is 2. |
| integer(kind=c_int32_t) | public | :: | n_measure_iters | Number of iterations to execute during backend and kernel autotuning when effort level is Default is 5. |
| type(dtfft_platform_t) | public | :: | platform | Selects the platform on which the plan is executed. Default is This option is only available when dtFFT is built with device support. Even then, not all plans must be device-related: a single library installation can support both host and CUDA plans. |
| type(dtfft_stream_t) | public | :: | stream | Main CUDA stream used by dtFFT. This parameter is a placeholder that lets the user set a custom stream. The stream actually used by a dtFFT plan is returned by When the user sets a custom stream, they are responsible for destroying it; the stream must not be destroyed before the call to |
| type(dtfft_backend_t) | public | :: | backend | Backend used by dtFFT when Default for the HOST platform is Default for the CUDA platform is |
| type(dtfft_backend_t) | public | :: | reshape_backend | Backend used by dtFFT for data reshaping from bricks to pencils and vice versa when Default for the HOST platform is Default for the CUDA platform is |
| logical(kind=c_bool) | public | :: | enable_datatype_backend | Should Default is true. This option only works when |
| logical(kind=c_bool) | public | :: | enable_mpi_backends | Should MPI backends be enabled when Default is false. The following applies only to CUDA builds: MPI backends are disabled by default during autotuning due to an OpenMPI bug (https://github.com/open-mpi/ompi/issues/12849). It was noticed that GPU memory is not completely freed during plan autotuning. For example, a 1024x1024x512 C2C double-precision plan on a single GPU with the Z-slab optimization and MPI backends enabled leaks 8 GB of GPU memory during autotuning; without the Z-slab optimization, running on 4 GPUs, it leaks 24 GB on each GPU. One workaround is to disable MPI backends by default, which is done here. Another is to pass `--mca btl_smcuda_use_cuda_ipc 0` to |
| logical(kind=c_bool) | public | :: | enable_pipelined_backends | Should pipelined backends be enabled when Default is true. |
| logical(kind=c_bool) | public | :: | enable_nccl_backends | Should NCCL backends be enabled when Default is true. This option is only defined when dtFFT is built with CUDA support. |
| logical(kind=c_bool) | public | :: | enable_nvshmem_backends | Should NVSHMEM backends be enabled when Default is true. This option is only defined when dtFFT is built with CUDA support. |
| logical(kind=c_bool) | public | :: | enable_kernel_autotune | Should dtFFT try to optimize kernel launch parameters during plan creation when Default is false. Kernel optimization is always enabled for |
| logical(kind=c_bool) | public | :: | enable_fourier_reshape | Should dtFFT execute reshapes from pencils to bricks and vice versa in Fourier space during calls to Default is false. When enabled, data is in brick layout in Fourier space, which may be useful for certain operations between forward and backward transforms; however, this requires additional data transpositions and reduces overall FFT performance. |
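For context, a minimal sketch of populating this type field by field and registering it before plan creation. The `dtfft` module name and the optional `error_code` argument of `dtfft_set_config` are assumptions here; consult the library's interfaces for the exact signatures.

```fortran
program configure_dtfft
  use iso_fortran_env, only: int32
  use dtfft                        ! assumed module name exposing dtfft_config_t
  implicit none

  type(dtfft_config_t) :: conf
  integer(int32)       :: error_code

  ! Override only the fields of interest; the rest keep their defaults.
  conf%enable_log      = .true.    ! print extra information during plan creation
  conf%enable_z_slab   = .false.   ! opt out of the Z-slab optimization
  conf%n_measure_iters = 10        ! raise autotuning iterations above the default of 5

  ! Make the configuration visible to plans created afterwards.
  call dtfft_set_config(conf, error_code=error_code)
end program configure_dtfft
```

Setting the configuration before creating any plan ensures the autotuning and backend options above actually take effect during plan creation.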
Interface to create a new configuration
Creates a new configuration
| Type | Intent | Optional | Attributes | Name | Description |
|---|---|---|---|---|---|
| logical | intent(in) | optional | :: | enable_log | Should dtFFT print additional information during plan creation or not. |
| logical | intent(in) | optional | :: | enable_z_slab | Should dtFFT use the Z-slab optimization or not. |
| logical | intent(in) | optional | :: | enable_y_slab | Should dtFFT use the Y-slab optimization or not. |
| integer(kind=int32) | intent(in) | optional | :: | n_measure_warmup_iters | Number of warmup iterations for measurements. |
| integer(kind=int32) | intent(in) | optional | :: | n_measure_iters | Number of measurement iterations. |
| type(dtfft_platform_t) | intent(in) | optional | :: | platform | Selects the platform on which the plan is executed. |
| type(dtfft_stream_t) | intent(in) | optional | :: | stream | Main CUDA stream used by dtFFT. |
| type(dtfft_backend_t) | intent(in) | optional | :: | backend | Backend used by dtFFT when |
| type(dtfft_backend_t) | intent(in) | optional | :: | reshape_backend | Backend used by dtFFT for data reshaping from bricks to pencils and vice versa when |
| logical | intent(in) | optional | :: | enable_datatype_backend | Should |
| logical | intent(in) | optional | :: | enable_mpi_backends | Should MPI backends be enabled when |
| logical | intent(in) | optional | :: | enable_pipelined_backends | Should pipelined backends be enabled when |
| logical | intent(in) | optional | :: | enable_nccl_backends | Should NCCL backends be enabled when |
| logical | intent(in) | optional | :: | enable_nvshmem_backends | Should NVSHMEM backends be enabled when |
| logical | intent(in) | optional | :: | enable_kernel_autotune | Should dtFFT try to autotune transpose/packing/unpacking kernel sizes during the autotune process or not. |
| logical | intent(in) | optional | :: | enable_fourier_reshape | Should dtFFT execute reshapes from pencils to bricks and vice versa in Fourier space during calls to |

Return value: the constructed dtFFT config, ready to be set by a call to `dtfft_set_config`.
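The constructor interface above can be combined with `dtfft_set_config` in one step. A hedged sketch follows; the `dtfft` module name and the optional `error_code` argument are assumptions, and omitted optional arguments keep the defaults listed in the component table.

```fortran
program construct_config
  use iso_fortran_env, only: int32
  use dtfft                        ! assumed module name
  implicit none

  type(dtfft_config_t) :: conf
  integer(int32)       :: error_code

  ! Pass only the optional arguments that should differ from the defaults.
  conf = dtfft_config_t(enable_log=.true., n_measure_warmup_iters=4)
  call dtfft_set_config(conf, error_code=error_code)
end program construct_config
```

Using the constructor with keyword arguments avoids touching fields individually and keeps the call self-documenting.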