dtfft_config_t Derived Type

type, public, bind(C) :: dtfft_config_t

Type used to set additional configuration parameters for dtFFT


Inherits

dtfft_config_t → dtfft_backend_t (backend, reshape_backend)
dtfft_config_t → dtfft_platform_t (platform)
dtfft_config_t → dtfft_stream_t (stream)
dtfft_stream_t → c_ptr (stream)

Components

logical(kind=c_bool), public :: enable_log

Should dtFFT print additional information during plan creation or not.

Default is false.

logical(kind=c_bool), public :: enable_z_slab

Should dtFFT use Z-slab optimization or not.

Default is true.

Consider disabling Z-slab optimization to resolve the DTFFT_ERROR_VKFFT_R2R_2D_PLAN error, or when the underlying 2D FFT implementation is too slow. In all other cases Z-slab is expected to be faster, since it reduces the number of data transpositions.

logical(kind=c_bool), public :: enable_y_slab

Should dtFFT use Y-slab optimization or not.

Default is false.

Consider disabling Y-slab optimization to resolve the DTFFT_ERROR_VKFFT_R2R_2D_PLAN error, or when the underlying 2D FFT implementation is too slow. In all other cases Y-slab is expected to be faster, since it reduces the number of data transpositions.
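As a minimal sketch (assuming the usual `use dtfft` module and the `dtfft_set_config` call mentioned below), slab optimizations can be toggled before plan creation like this:

```fortran
program disable_slab_example
  use dtfft
  implicit none

  type(dtfft_config_t) :: conf

  ! Disable Z-slab to work around DTFFT_ERROR_VKFFT_R2R_2D_PLAN
  ! or a slow underlying 2D FFT implementation; all other fields
  ! keep their defaults.
  conf = dtfft_config_t(enable_z_slab=.false., enable_log=.true.)

  ! The configuration must be applied before creating a plan.
  call dtfft_set_config(conf)
end program disable_slab_example
```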

integer(kind=c_int32_t), public :: n_measure_warmup_iters

Number of warmup iterations to execute during backend and kernel autotuning when effort level is DTFFT_MEASURE or higher.

Default is 2.

integer(kind=c_int32_t), public :: n_measure_iters

Number of iterations to execute during backend and kernel autotuning when effort level is DTFFT_MEASURE or higher.

Default is 5.

type(dtfft_platform_t), public :: platform

Selects platform to execute plan.

Default is DTFFT_PLATFORM_HOST.

This option is only available when dtFFT is built with device support. Building with device support does not mean that all plans must run on the device; it allows a single library installation to support both host and CUDA plans.

type(dtfft_stream_t), public :: stream

Main CUDA stream that will be used in dtFFT.

This parameter allows the user to provide a custom stream.

The stream actually used by a dtFFT plan is returned by the plan%get_stream function.

When the user provides a stream, they are responsible for destroying it.

The stream must not be destroyed before the call to plan%destroy.
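A hedged sketch of providing a custom stream on the CUDA platform. The use of CUDA Fortran (`cudafor`) and the `transfer` of the stream handle into the `c_ptr` held by dtfft_stream_t are assumptions for illustration; consult the dtFFT headers for the supported way to construct a dtfft_stream_t:

```fortran
program custom_stream_example
  use iso_c_binding
  use cudafor   ! assumed: NVIDIA CUDA Fortran for stream creation
  use dtfft
  implicit none

  type(dtfft_config_t)       :: conf
  integer(cuda_stream_kind)  :: my_stream
  integer                    :: ierr

  ! We create the stream, so we own it: it must be destroyed by us,
  ! but only after plan%destroy has been called.
  ierr = cudaStreamCreate(my_stream)

  conf = dtfft_config_t(platform=DTFFT_PLATFORM_CUDA)
  ! Assumption: a CUDA stream handle is bit-compatible with the
  ! c_ptr stored inside dtfft_stream_t.
  conf%stream%stream = transfer(my_stream, c_null_ptr)

  call dtfft_set_config(conf)
end program custom_stream_example
```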

type(dtfft_backend_t), public :: backend

Backend that will be used by dtFFT when effort is DTFFT_ESTIMATE or DTFFT_MEASURE.

Default for HOST platform is DTFFT_BACKEND_MPI_DATATYPE.

Default for CUDA platform is DTFFT_BACKEND_NCCL if NCCL is enabled, otherwise DTFFT_BACKEND_MPI_P2P.

type(dtfft_backend_t), public :: reshape_backend

Backend that will be used by dtFFT for data reshaping from bricks to pencils and vice versa when effort is DTFFT_ESTIMATE or DTFFT_MEASURE.

Default for HOST platform is DTFFT_BACKEND_MPI_DATATYPE.

Default for CUDA platform is DTFFT_BACKEND_NCCL if NCCL is enabled, otherwise DTFFT_BACKEND_MPI_P2P.

logical(kind=c_bool), public :: enable_datatype_backend

Should DTFFT_BACKEND_MPI_DATATYPE be considered for autotuning when effort is DTFFT_PATIENT or DTFFT_EXHAUSTIVE.

Default is true.

This option only works when platform is DTFFT_PLATFORM_HOST. When platform is DTFFT_PLATFORM_CUDA, DTFFT_BACKEND_MPI_DATATYPE is always disabled during autotuning.

logical(kind=c_bool), public :: enable_mpi_backends

Should MPI Backends be enabled when effort is DTFFT_PATIENT or DTFFT_EXHAUSTIVE.

Default is false.

The following applies only to CUDA builds. MPI backends are disabled by default during autotuning due to an OpenMPI bug (https://github.com/open-mpi/ompi/issues/12849): GPU memory is not completely freed during plan autotuning. For example, a 1024x1024x512 double-precision C2C plan on a single GPU, using Z-slab optimization with MPI backends enabled, leaks 8 GB of GPU memory during autotuning; without Z-slab optimization, running on 4 GPUs, it leaks 24 GB on each GPU.

One workaround is to disable MPI backends by default, which is what this option does.

Another is to pass "--mca btl_smcuda_use_cuda_ipc 0" to mpiexec, but disabling CUDA IPC has been observed to seriously degrade the overall performance of MPI algorithms.

logical(kind=c_bool), public :: enable_pipelined_backends

Should pipelined backends be enabled when effort is DTFFT_PATIENT or DTFFT_EXHAUSTIVE.

Default is true.

logical(kind=c_bool), public :: enable_nccl_backends

Should NCCL Backends be enabled when effort is DTFFT_PATIENT or DTFFT_EXHAUSTIVE.

Default is true.

This option is only defined when dtFFT is built with CUDA support.

logical(kind=c_bool), public :: enable_nvshmem_backends

Should NVSHMEM Backends be enabled when effort is DTFFT_PATIENT or DTFFT_EXHAUSTIVE.

Default is true.

This option is only defined when dtFFT is built with CUDA support.

logical(kind=c_bool), public :: enable_kernel_autotune

Should dtFFT try to optimize kernel launch parameters during plan creation when effort is below DTFFT_EXHAUSTIVE.

Default is false.

Kernel optimization is always enabled for DTFFT_EXHAUSTIVE effort level. Setting this option to true enables kernel optimization for lower effort levels (DTFFT_ESTIMATE, DTFFT_MEASURE, DTFFT_PATIENT). This may increase plan creation time but can improve runtime performance. Since kernel optimization is performed without data transfers, the time increase is usually minimal.

logical(kind=c_bool), public :: enable_fourier_reshape

Should dtFFT execute reshapes from pencils to bricks and vice versa in Fourier space during calls to execute.

Default is false.

When enabled, data will be in brick layout in Fourier space, which may be useful for certain operations between forward and backward transforms. However, this requires additional data transpositions and will reduce overall FFT performance.


Constructor

public interface dtfft_config_t

Interface to create a new configuration

  • private pure function config_constructor(enable_log, enable_z_slab, enable_y_slab, n_measure_warmup_iters, n_measure_iters, platform, stream, backend, reshape_backend, enable_datatype_backend, enable_mpi_backends, enable_pipelined_backends, enable_nccl_backends, enable_nvshmem_backends, enable_kernel_autotune, enable_fourier_reshape) result(config)

    Creates a new configuration

    Arguments

    logical, intent(in), optional :: enable_log

    Should dtFFT print additional information during plan creation or not.

    logical, intent(in), optional :: enable_z_slab

    Should dtFFT use Z-slab optimization or not.

    logical, intent(in), optional :: enable_y_slab

    Should dtFFT use Y-slab optimization or not.

    integer(kind=int32), intent(in), optional :: n_measure_warmup_iters

    Number of warmup iterations for measurements

    integer(kind=int32), intent(in), optional :: n_measure_iters

    Number of measurement iterations

    type(dtfft_platform_t), intent(in), optional :: platform

    Selects platform to execute plan.

    type(dtfft_stream_t), intent(in), optional :: stream

    Main CUDA stream that will be used in dtFFT.

    type(dtfft_backend_t), intent(in), optional :: backend

    Backend that will be used by dtFFT when effort is DTFFT_ESTIMATE or DTFFT_MEASURE.

    type(dtfft_backend_t), intent(in), optional :: reshape_backend

    Backend that will be used by dtFFT for data reshaping from bricks to pencils and vice versa when effort is DTFFT_ESTIMATE or DTFFT_MEASURE.

    logical, intent(in), optional :: enable_datatype_backend

Should DTFFT_BACKEND_MPI_DATATYPE be considered for autotuning when effort is DTFFT_PATIENT or DTFFT_EXHAUSTIVE.

    logical, intent(in), optional :: enable_mpi_backends

Should MPI Backends be enabled when effort is DTFFT_PATIENT or DTFFT_EXHAUSTIVE.

    logical, intent(in), optional :: enable_pipelined_backends

Should pipelined backends be enabled when effort is DTFFT_PATIENT or DTFFT_EXHAUSTIVE.

    logical, intent(in), optional :: enable_nccl_backends

Should NCCL Backends be enabled when effort is DTFFT_PATIENT or DTFFT_EXHAUSTIVE.

    logical, intent(in), optional :: enable_nvshmem_backends

Should NVSHMEM Backends be enabled when effort is DTFFT_PATIENT or DTFFT_EXHAUSTIVE.

    logical, intent(in), optional :: enable_kernel_autotune

Should dtFFT try to autotune transpose/packing/unpacking kernel launch parameters during the autotuning process.

    logical, intent(in), optional :: enable_fourier_reshape

Should dtFFT execute reshapes from pencils to bricks and vice versa in Fourier space during calls to execute.

    Return Value type(dtfft_config_t)

Constructed dtFFT config, ready to be applied by a call to dtfft_set_config
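Putting it together, a sketch of building a configuration through the constructor with several optional arguments (the specific values shown are illustrative, not recommendations):

```fortran
program config_constructor_example
  use dtfft
  implicit none

  type(dtfft_config_t) :: conf

  ! All constructor arguments are optional keyword arguments;
  ! any field left unspecified keeps its documented default.
  conf = dtfft_config_t(enable_log=.true.,                 &
                        enable_z_slab=.false.,             &
                        n_measure_warmup_iters=4,          &
                        n_measure_iters=10,                &
                        enable_pipelined_backends=.true.)

  ! Apply the configuration before any plan is created.
  call dtfft_set_config(conf)
end program config_constructor_example
```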