Procedures

Procedure Location Procedure Type Description
add_line dtfft_nvrtc_kernel Subroutine

Adds new line to CUDA code

alloc_and_set_aux dtfft_transpose_plan_cuda Function

Allocates auxiliary memory according to the backend and sets it to the plans

alloc_fft_plans dtfft_plan Subroutine

Allocates abstract_executor with required FFT class and populates fft_mapping with similar FFT ids

alloc_mem dtfft_abstract_transpose_plan Subroutine

Allocates memory based on backend

astring_f2c dtfft_utils Subroutine

Convert Fortran string to C allocatable string

autotune_grid dtfft_transpose_plan_cuda Subroutine

Creates cartesian grid and runs various backends on it. Can return best backend and execution time

autotune_grid dtfft_transpose_plan_host Subroutine

Creates cartesian communicator and executes various datatypes on it

autotune_grid_decomposition dtfft_transpose_plan_cuda Subroutine

Runs through all possible grid decompositions and selects the best one based on the lowest average execution time

autotune_grid_decomposition dtfft_transpose_plan_host Subroutine

Runs through all possible grid decompositions and selects the best one based on the lowest average execution time

autotune_mpi_datatypes dtfft_transpose_plan_host Subroutine
autotune_transpose_id dtfft_transpose_plan_host Function

Creates forward and backward transpose plans based on source and target data distributions, executes them DTFFT_MEASURE_ITERS times (4 * DTFFT_MEASURE_ITERS iterations in total)

check_aux dtfft_plan Subroutine

Checks whether the aux buffer was passed by the user and, if not, allocates one internally

check_create_args dtfft_plan Function

Checks arguments provided by the user and sets private variables

check_device_pointers dtfft_plan Function

Checks if device pointers are provided by user

clean_unused_cache dtfft_nvrtc_kernel Subroutine

Removes unused modules from the CUDA context

Comm_f2c dtfft_utils Interface

Converts Fortran communicator to C

compile_and_cache dtfft_nvrtc_kernel Function

Compiles kernel and caches it. Returns compiled kernel.

config_constructor dtfft_config Function

Creates a new configuration

count_unique dtfft_utils Function

Count the number of unique elements in the array

create dtfft_executor_cufft_m Subroutine

Creates FFT plan via cuFFT Interface

create dtfft_nvrtc_kernel Subroutine

Creates kernel

create dtfft_executor_mkl_m Subroutine

Creates FFT plan via MKL DFTI Interface

create dtfft_abstract_transpose_plan Function

Creates transposition plans

create dtfft_abstract_executor Function

Creates FFT plan

create dtfft_abstract_backend Subroutine

Creates Abstract GPU Backend

create dtfft_transpose_handle_cuda Subroutine

Creates CUDA Transpose Handle

create dtfft_pencil Subroutine

Creates pencil

create dtfft_executor_fftw_m Subroutine

Creates FFT plan via FFTW3 Interface

create dtfft_backend_cufftmp_m Subroutine

Creates cuFFTMp GPU Backend

create dtfft_executor_vkfft_m Subroutine

Creates FFT plan via vkFFT Interface

create dtfft_transpose_handle_host Subroutine

Creates transpose_handle_host class

create_c2c dtfft_plan Subroutine

C2C Plan Constructor

create_c2c_internal dtfft_plan Function

Creates plan for both C2C and R2C

create_cart_comm dtfft_abstract_transpose_plan Subroutine

Creates cartesian communicator

create_cuda dtfft_transpose_plan_cuda Function

Creates CUDA transpose plan

create_data_handle dtfft_transpose_handle_cuda Subroutine

Creates handle

create_device_pointer dtfft_nvrtc_kernel Subroutine

Allocates memory on a device and copies values to it.

create_handle dtfft_transpose_handle_host Subroutine

Creates transposition handle

create_helper dtfft_abstract_backend Subroutine

Creates helper

create_helper dtfft_backend_mpi Subroutine

Creates MPI helper

create_mpi dtfft_backend_mpi Subroutine

Creates MPI backend

create_nccl dtfft_backend_nccl_m Subroutine

Creates NCCL backend

create_nvtx_domain dtfft_interface_nvtx Subroutine

Creates a new NVTX domain

create_private dtfft_plan Function

Creates core

create_private dtfft_transpose_plan_host Function

Creates transposition plans

create_r2c dtfft_plan Subroutine

R2C Generic Plan Constructor

create_r2r dtfft_plan Subroutine

R2R Plan Constructor

create_transpose_2d dtfft_transpose_handle_host Subroutine

Creates two-dimensional transposition datatypes

create_transpose_XY dtfft_transpose_handle_host Subroutine

Creates three-dimensional X --> Y, Y --> X transposition datatypes

create_transpose_XZ dtfft_transpose_handle_host Subroutine

Creates three-dimensional X --> Z transposition datatypes. Can only be used with 3D slab decomposition when slabs are distributed in Z direction

create_transpose_YZ dtfft_transpose_handle_host Subroutine

Creates three-dimensional Y --> Z, Z --> Y transposition datatypes

create_transpose_ZX dtfft_transpose_handle_host Subroutine

Creates three-dimensional Z --> X transposition datatypes. Can only be used with 3D slab decomposition when slabs are distributed in Z direction

cudaDeviceSynchronize dtfft_interface_cuda_runtime Interface

Synchronizes the device, blocking until all preceding tasks in all streams have completed.

cudaEventCreate dtfft_interface_cuda_runtime Interface

Creates an event.

cudaEventCreateWithFlags dtfft_interface_cuda_runtime Interface

Creates an event with the specified flags.

cudaEventDestroy dtfft_interface_cuda_runtime Interface

Destroys an event.

cudaEventElapsedTime dtfft_interface_cuda_runtime Interface

Computes the elapsed time between two events.

cudaEventRecord dtfft_interface_cuda_runtime Interface

Records an event in a stream.

cudaEventSynchronize dtfft_interface_cuda_runtime Interface

Waits for an event to complete.

cudaFree dtfft_interface_cuda_runtime Interface

Frees memory on the device.

cudaGetDevice dtfft_interface_cuda_runtime Interface

Returns the current device.

cudaGetDeviceCount dtfft_interface_cuda_runtime Interface

Returns the number of available devices.

cudaGetErrorString dtfft_interface_cuda_runtime Function

Helper function that returns a string describing the given cudaError_t code. If the error code is not recognized, “unrecognized error code” is returned.

cudaGetErrorString_c dtfft_interface_cuda_runtime Interface

Returns the string representation of an error code.

cudaMalloc dtfft_interface_cuda_runtime Interface

Allocates memory on the device.

cudaMemcpy dtfft_interface_cuda_runtime Interface

Copies data synchronously between host and device.

cudaMemcpyAsync dtfft_interface_cuda_runtime Interface

Copies data asynchronously between host and device.

cudaMemGetInfo dtfft_interface_cuda_runtime Interface

Returns the amount of free and total memory on the device.

cudaMemset dtfft_interface_cuda_runtime Interface

Initializes or sets device memory to a value.

cudaSetDevice dtfft_interface_cuda_runtime Interface

Sets the current device.

cudaStreamCreate dtfft_interface_cuda_runtime Interface

Creates an asynchronous stream.

cudaStreamDestroy dtfft_interface_cuda_runtime Interface

Destroys an asynchronous stream.

cudaStreamQuery dtfft_interface_cuda_runtime Interface

Queries an asynchronous stream for completion status.

cudaStreamSynchronize dtfft_interface_cuda_runtime Interface

Waits for stream tasks to complete.

cudaStreamWaitEvent dtfft_interface_cuda_runtime Interface

Makes a stream wait on an event.

cufftDestroy dtfft_interface_cufft Interface

Frees all GPU resources associated with a cuFFT plan and destroys the internal plan data structure.

cufftGetErrorString dtfft_interface_cufft Function

Returns a string representation of the cuFFT error code.

cufftMpAttachReshapeComm dtfft_interface_cufft Interface

Attaches a communication handle to a reshape. This function is not collective.

cufftMpCreateReshape dtfft_interface_cufft Interface

Initializes a reshape handle for future use. This function is not collective.

cufftMpDestroyReshape dtfft_interface_cufft Interface

Destroys a reshape and all its associated data.

cufftMpExecReshapeAsync dtfft_interface_cufft Interface

Executes the reshape, redistributing data_in into data_out using the workspace in workspace.

cufftMpGetReshapeSize dtfft_interface_cufft Interface

Returns the amount (in bytes) of workspace required to execute the handle.

cufftMpMakeReshape dtfft_interface_cufft Interface

Creates a reshape intended to re-distribute a global array of 3D data.

cufftPlanMany dtfft_interface_cufft Interface

Creates a FFT plan configuration of dimension rank, with sizes specified in the array n.

cufftSetStream dtfft_interface_cufft Interface

Associates a CUDA stream with a cuFFT plan.

cufftXtExec dtfft_interface_cufft Interface

Executes any cuFFT transform regardless of precision and type. In case of complex-to-real and real-to-complex transforms, the direction parameter is ignored.

cuLaunchKernel dtfft_interface_cuda Function

Launches a CUDA function CUfunction or a CUDA kernel CUkernel.

destoy_helper dtfft_backend_mpi Subroutine

Destroys MPI helper

destroy dtfft_plan Subroutine

Destroys plan, frees all memory

destroy dtfft_executor_cufft_m Subroutine

Destroys cuFFT plan

destroy dtfft_nvrtc_kernel Subroutine

Destroys kernel

destroy dtfft_executor_mkl_m Subroutine

Destroys MKL plan

destroy dtfft_abstract_executor Subroutine

Destroys plan

destroy dtfft_abstract_backend Subroutine

Destroys Abstract GPU Backend

destroy dtfft_transpose_handle_cuda Subroutine

Destroys CUDA Transpose Handle

destroy dtfft_pencil Subroutine

Destroys pencil

destroy dtfft_executor_fftw_m Subroutine

Destroys FFTW3 plan

destroy dtfft_backend_cufftmp_m Subroutine

Destroys cuFFTMp GPU Backend

destroy dtfft_executor_vkfft_m Subroutine

Destroys vkFFT plan

destroy dtfft_transpose_plan_host Subroutine

Destroys transposition plans

destroy dtfft_transpose_handle_host Subroutine

Destroys transpose_handle_host class

destroy_code dtfft_nvrtc_kernel Subroutine

Frees all memory

destroy_cuda dtfft_transpose_plan_cuda Subroutine

Destroys transposition plans

destroy_data_handle dtfft_transpose_handle_cuda Subroutine

Destroys handle

destroy_handle dtfft_transpose_handle_host Subroutine

Destroys transposition handle

destroy_helper dtfft_abstract_backend Subroutine

Destroys helper

destroy_mpi dtfft_backend_mpi Subroutine

Destroys MPI backend

destroy_nccl dtfft_backend_nccl_m Subroutine

Destroys NCCL backend

destroy_pencil_t dtfft_pencil Subroutine

Destroys pencil

destroy_stream dtfft_config Subroutine

Destroy the default stream if it was created

destroy_strings dtfft_utils Subroutine

Destroys array of string objects

DftiErrorMessage dtfft_interface_mkl_m Function

Generates an error message.

DftiErrorMessage_c dtfft_interface_mkl_m Interface

Generates an error message.

dl_error dtfft_utils Subroutine

Writes error message to the error unit

dlclose dtfft_utils Interface

Close a dynamic library or bundle

dlerror dtfft_utils Interface

Get diagnostic information

dlopen dtfft_utils Interface

Load and link a dynamic library or bundle

dlsym dtfft_utils Interface

Get address of a symbol

double_to_str dtfft_utils Function

Convert double to string

dtfft_config_t dtfft_config Interface

Interface to create a new configuration

dtfft_create_config dtfft_config Subroutine

Creates a new configuration with default values.

dtfft_create_plan_c2c_c dtfft_api Function

Creates C2C dtFFT Plan, allocates all structures and prepares FFT, C/C++ interface

dtfft_create_plan_r2r_c dtfft_api Function

Creates R2R dtFFT Plan, allocates all structures and prepares FFT, C/C++/Python interface

dtfft_destroy_c dtfft_api Function

Destroys dtFFT Plan, C/C++ interface

dtfft_execute_c dtfft_api Function

Executes dtFFT Plan, C/C++ interface. aux can be NULL.

dtfft_get_alloc_bytes_c dtfft_api Function

Returns minimum number of bytes required to execute plan, C/C++ interface

dtfft_get_alloc_size_c dtfft_api Function

Returns minimum number of elements to be allocated for in and out buffers, C/C++ interface

dtfft_get_backend_c dtfft_api Function

Returns the dtfft_backend_t selected during autotuning

dtfft_get_backend_string dtfft_parameters Function

Gets the string description of a GPU backend

dtfft_get_backend_string_c dtfft_api Subroutine

Returns string representation of dtfft_backend_t

dtfft_get_cuda_stream dtfft_parameters Function

Returns the CUDA stream from dtfft_stream_t

dtfft_get_element_size_c dtfft_api Function

Returns size of element in bytes, C/C++ interface

dtfft_get_error_string dtfft_parameters Function

Gets the string description of an error code

dtfft_get_error_string_c dtfft_api Subroutine

Returns an explanation of error_code that could have been previously returned by one of the dtFFT API calls, C/C++ interface

dtfft_get_local_sizes_c dtfft_api Function

Returns local sizes, counts in real and Fourier spaces and number of elements to be allocated for in and out buffers, C/C++ interface.

dtfft_get_pencil_c dtfft_api Function

Returns pencil decomposition info, C/C++ interface

dtfft_get_platform_c dtfft_api Function

Returns selected dtfft_platform_t during autotuning

dtfft_get_stream_c dtfft_api Function

Returns Stream associated with plan

dtfft_get_version dtfft_parameters Interface

Get dtFFT version

dtfft_get_version_current dtfft_parameters Function

Returns the current version code

dtfft_get_version_required dtfft_parameters Function

Returns the version code required by the user

dtfft_get_z_slab_enabled_c dtfft_api Function

Checks if dtFFT Plan is using Z-slab optimization

dtfft_mem_alloc_c dtfft_api Function

Allocates memory for dtFFT Plan, C/C++ interface

dtfft_mem_free_c dtfft_api Function

Frees memory for dtFFT Plan, C/C++ interface

dtfft_report_c dtfft_api Function

Reports dtFFT Plan, C/C++ interface

dtfft_set_config dtfft_config Subroutine

Sets configuration parameters

dtfft_set_config_c dtfft_api Function

Sets dtFFT configuration, C/C++ interface

dtfft_stream_t dtfft_parameters Interface

Creates dtfft_stream_t from integer(cuda_stream_kind)

dtfft_transpose_c dtfft_api Function

Executes single transposition, C/C++ interface.

dynamic_load dtfft_utils Function

Dynamically loads library and its symbols

effort_eq dtfft_parameters Function
effort_ne dtfft_parameters Function
execute dtfft_plan Subroutine

Executes plan

execute dtfft_executor_cufft_m Subroutine

Executes cuFFT plan

execute dtfft_nvrtc_kernel Subroutine

Executes kernel on stream

execute dtfft_executor_mkl_m Subroutine

Executes MKL plan

execute dtfft_abstract_transpose_plan Subroutine

Executes single transposition

execute dtfft_abstract_executor Subroutine

Executes plan

execute dtfft_abstract_backend Subroutine

Executes GPU Backend

execute dtfft_transpose_handle_cuda Subroutine

Executes transpose - exchange - unpack

execute dtfft_executor_fftw_m Subroutine

Executes FFTW3 plan

execute dtfft_backend_cufftmp_m Subroutine

Executes cuFFTMp GPU Backend

execute dtfft_executor_vkfft_m Subroutine

Executes vkFFT plan

execute dtfft_transpose_handle_host Subroutine

Executes transposition

execute_cuda dtfft_transpose_plan_cuda Subroutine

Executes single transposition

execute_mpi dtfft_backend_mpi Subroutine

Executes MPI backend

execute_nccl dtfft_backend_nccl_m Subroutine

Executes NCCL backend

execute_private dtfft_plan Subroutine

Executes plan with specified auxiliary buffer

execute_private dtfft_transpose_plan_host Subroutine

Executes single transposition

execute_ptr dtfft_plan Subroutine

Executes plan using type(c_ptr) pointers instead of buffers

execute_type_eq dtfft_parameters Function
execute_type_ne dtfft_parameters Function
executor_eq dtfft_parameters Function
executor_ne dtfft_parameters Function
fftw_execute_dft dtfft_interface_fftw_m Interface
fftw_execute_dft_c2r dtfft_interface_fftw_m Interface
fftw_execute_dft_r2c dtfft_interface_fftw_m Interface
fftw_execute_r2r dtfft_interface_fftw_m Interface
fftw_plan_many_dft dtfft_interface_fftw_m Interface
fftw_plan_many_dft_c2r dtfft_interface_fftw_m Interface
fftw_plan_many_dft_r2c dtfft_interface_fftw_m Interface
fftw_plan_many_r2r dtfft_interface_fftw_m Interface
fftwf_execute_dft dtfft_interface_fftw_m Interface
fftwf_execute_dft_c2r dtfft_interface_fftw_m Interface
fftwf_execute_dft_r2c dtfft_interface_fftw_m Interface
fftwf_execute_r2r dtfft_interface_fftw_m Interface
fftwf_plan_many_dft dtfft_interface_fftw_m Interface
fftwf_plan_many_dft_c2r dtfft_interface_fftw_m Interface
fftwf_plan_many_dft_r2c dtfft_interface_fftw_m Interface
fftwf_plan_many_r2r dtfft_interface_fftw_m Interface
free_datatypes dtfft_transpose_handle_host Subroutine

Frees temporary datatypes

free_mem dtfft_abstract_transpose_plan Subroutine

Frees memory based on backend

get_alloc_bytes dtfft_plan Function

Returns minimum number of bytes required to execute plan

get_alloc_size dtfft_plan Function

Wrapper around get_local_sizes to obtain number of elements only

get_aux_size dtfft_abstract_backend Function

Returns number of bytes required by aux buffer

get_aux_size dtfft_transpose_handle_cuda Function

Returns number of bytes required by aux buffer

get_backend dtfft_plan Function

Returns selected GPU backend during autotuning

get_backend dtfft_abstract_transpose_plan Function

Returns plan GPU backend

get_backend_from_env dtfft_utils Function

Returns GPU backend to use set by environment variable

get_cached_kernel dtfft_nvrtc_kernel Function

Returns the cached kernel if it exists; otherwise returns a null pointer.

get_code_init dtfft_nvrtc_kernel Subroutine

Generates basic code that is used in all other kernels

get_comm dtfft_api Function
get_contiguous_execution_blocks dtfft_nvrtc_kernel Subroutine
get_cuda_architecture dtfft_interface_cuda_runtime Interface

Returns the CUDA architecture for a given device.

get_datatype_from_env dtfft_utils Function

Obtains datatype id from environment variable

get_element_size dtfft_plan Function

Returns number of bytes required to store single element.

get_env dtfft_utils Interface

Obtains environment variable

get_env_base dtfft_utils Function

Base function for obtaining a dtFFT environment variable

get_env_int32 dtfft_utils Function

Base integer function for obtaining a dtFFT environment variable

get_env_int8 dtfft_utils Function

Obtains int8 environment variable

get_env_logical dtfft_utils Function

Obtains logical environment variable

get_env_string dtfft_utils Function

Obtains string environment variable

get_inverse_kind dtfft_utils Function

Get the inverse R2R kind of transform for the given R2R kind

get_iters_from_env dtfft_utils Function

Obtains number of iterations from environment variable

get_local_size dtfft_pencil Subroutine

Computes local portions of data based on global count and position inside grid communicator

get_local_sizes dtfft_plan Subroutine

Obtains local starts and counts in real and Fourier spaces

get_local_sizes dtfft_pencil Subroutine

Obtains local starts and counts in real and Fourier spaces

get_log_enabled dtfft_utils Function

Returns the value of the log_enabled variable

get_mpi_enabled dtfft_config Function

Whether MPI backends are enabled or not

get_mpi_enabled_from_env dtfft_utils Function

Returns usage of MPI Backends during autotune set by environment variable

get_nccl_enabled dtfft_config Function

Whether NCCL backends are enabled or not

get_nccl_enabled_from_env dtfft_utils Function

Returns usage of NCCL Backends during autotune set by environment variable

get_neighbor_function_code dtfft_nvrtc_kernel Subroutine

Generates a device function that determines the id of the process to which data is being sent, or from which data has been received, based on the local element coordinate

get_nvshmem_enabled dtfft_config Function

Whether NVSHMEM backends are enabled or not

get_nvshmem_enabled_from_env dtfft_utils Function

Returns usage of NVSHMEM Backends during autotune set by environment variable

get_pencil dtfft_plan Function

Returns pencil decomposition

get_pipe_enabled_from_env dtfft_utils Function

Returns usage of Pipelined Backends during autotune set by environment variable

get_pipelined_enabled dtfft_config Function

Whether pipelined backends are enabled or not

get_plan_execution_time dtfft_transpose_plan_host Function

Creates transpose plan and executes it DTFFT_MEASURE_WARMUP_ITERS + DTFFT_MEASURE_ITERS times

get_platform dtfft_plan Function

Returns execution platform of the plan (HOST or CUDA)

get_platform_from_env dtfft_utils Function

Returns execution platform set by environment variable

get_stream_int64 dtfft_plan Subroutine

Returns CUDA stream associated with plan

get_stream_ptr dtfft_plan Subroutine

Returns CUDA stream associated with plan

get_tile_size dtfft_nvrtc_kernel Function

Returns tile size to use in a transpose kernel

get_tranpose_type dtfft_transpose_handle_cuda Function

Returns transpose_type, associated with handle

get_transpose_kernel_code dtfft_nvrtc_kernel Function

Generates code that will be used to locally transpose data and prepare it for sending to other processes. ndims == 2

get_transpose_type dtfft_pencil Function

Determines transpose ID based on pencils

get_true_transpose_type dtfft_nvrtc_kernel Function

Returns generic transpose id. Since X-Y and Y-Z transpositions are symmetric, it returns only one of them. X-Z and Z-X are not symmetric

get_unpack_kernel_code dtfft_nvrtc_kernel Function

Generates code that will be used to unpack data when it is received

get_unpack_pipelined_kernel_code dtfft_nvrtc_kernel Function

Generates code that will be used to partially unpack data when it is received from another process

get_user_gpu_backend dtfft_config Function

Returns GPU backend set by the user or default one

get_user_platform dtfft_config Function

Returns platform set by the user or default one

get_user_stream dtfft_config Function

Returns either the custom stream provided by the user or creates a new one

get_z_slab dtfft_config Function

Whether Z-slab optimization is enabled or not

get_z_slab_enabled dtfft_plan Function

Returns a logical value indicating whether Z-slab optimization is enabled internally

get_z_slab_from_env dtfft_utils Function

Returns whether Z-slab is to be used, as set by environment variable

gpu_backend_eq dtfft_parameters Function
gpu_backend_ne dtfft_parameters Function
init_internal dtfft_utils Function

Checks if MPI is initialized and loads environment variables

int_to_str dtfft_utils Interface

Converts integer to string

int_to_str_int32 dtfft_utils Function

Convert 32-bit integer to string

int_to_str_int64 dtfft_utils Function

Convert 64-bit integer to string

int_to_str_int8 dtfft_utils Function

Convert 8-bit integer to string

is_backend_mpi dtfft_parameters Function
is_backend_nccl dtfft_parameters Function
is_backend_nvshmem dtfft_parameters Function
is_backend_pipelined dtfft_parameters Function
is_cuda_executor dtfft_parameters Function
is_device_ptr dtfft_utils Interface

Checks if pointer can be accessed from device

is_host_executor dtfft_parameters Function
is_null_funptr dtfft_utils Function

Checks if pointer is NULL

is_null_ptr dtfft_utils Function

Checks if pointer is NULL

is_null_ptr dtfft_utils Interface

Checks if pointer is NULL

is_nvshmem_ptr dtfft_interface_nvshmem Function

Checks if pointer is a symmetric nvshmem allocated pointer

is_same_ptr dtfft_utils Function

Checks if two pointers are the same

is_valid_comm_type dtfft_parameters Function
is_valid_dimension dtfft_parameters Function
is_valid_effort dtfft_parameters Function
is_valid_execute_type dtfft_parameters Function
is_valid_executor dtfft_parameters Function
is_valid_gpu_backend dtfft_parameters Function
is_valid_platform dtfft_parameters Function
is_valid_precision dtfft_parameters Function
is_valid_r2r_kind dtfft_parameters Function
is_valid_transpose_type dtfft_parameters Function
load dtfft_interface_vkfft_m Function

Loads VkFFT library

load_cuda dtfft_interface_cuda Function

Loads the CUDA Driver library and needed symbols

load_library dtfft_utils Function

Dynamically loads library

load_nvrtc dtfft_interface_nvrtc Function

Dynamically loads nvRTC library and its functions

load_symbol dtfft_utils Function

Dynamically loads symbol from library

load_vkfft dtfft_interface_vkfft_m Function

Loads VkFFT library based on the platform

make_plan dtfft_executor_mkl_m Subroutine

Creates general MKL plan

make_public dtfft_pencil Function

Creates public object that users can use to create own FFT backends

mark_unused dtfft_nvrtc_kernel Subroutine

Takes a CUDA kernel as an argument and searches for it in the cache. If the kernel is found, reduces ref_count and returns a null pointer

mem_alloc dtfft_executor_cufft_m Subroutine

Dummy method. Raises error stop

mem_alloc dtfft_executor_mkl_m Subroutine

Allocates MKL memory

mem_alloc dtfft_abstract_transpose_plan Subroutine

Allocates memory based on selected backend

mem_alloc dtfft_executor_fftw_m Subroutine

Allocates FFTW3 memory

mem_alloc dtfft_executor_vkfft_m Subroutine

Dummy method. Raises error stop

mem_alloc_c32_1d dtfft_plan Subroutine

Allocates pointer of rank 1

mem_alloc_c32_2d dtfft_plan Subroutine

Allocates pointer of rank 2

mem_alloc_c32_3d dtfft_plan Subroutine

Allocates pointer of rank 3

mem_alloc_c64_1d dtfft_plan Subroutine

Allocates pointer of rank 1

mem_alloc_c64_2d dtfft_plan Subroutine

Allocates pointer of rank 2

mem_alloc_c64_3d dtfft_plan Subroutine

Allocates pointer of rank 3

mem_alloc_host dtfft_utils Interface

Allocates memory using the C11 standard aligned_alloc with 16-byte alignment

mem_alloc_ptr dtfft_plan Subroutine

Allocates memory specific for this plan

mem_alloc_r32_1d dtfft_plan Subroutine

Allocates pointer of rank 1

mem_alloc_r32_2d dtfft_plan Subroutine

Allocates pointer of rank 2

mem_alloc_r32_3d dtfft_plan Subroutine

Allocates pointer of rank 3

mem_alloc_r64_1d dtfft_plan Subroutine

Allocates pointer of rank 1

mem_alloc_r64_2d dtfft_plan Subroutine

Allocates pointer of rank 2

mem_alloc_r64_3d dtfft_plan Subroutine

Allocates pointer of rank 3

mem_free dtfft_executor_cufft_m Subroutine

Dummy method. Raises error stop

mem_free dtfft_executor_mkl_m Subroutine

Frees MKL aligned memory

mem_free dtfft_abstract_transpose_plan Subroutine

Frees memory allocated with mem_alloc

mem_free dtfft_executor_fftw_m Subroutine

Frees FFTW3 aligned memory

mem_free dtfft_executor_vkfft_m Subroutine

Dummy method. Raises error stop

mem_free_c32_1d dtfft_plan Subroutine

Frees previously allocated memory specific for this plan

mem_free_c32_2d dtfft_plan Subroutine

Frees previously allocated memory specific for this plan

mem_free_c32_3d dtfft_plan Subroutine

Frees previously allocated memory specific for this plan

mem_free_c64_1d dtfft_plan Subroutine

Frees previously allocated memory specific for this plan

mem_free_c64_2d dtfft_plan Subroutine

Frees previously allocated memory specific for this plan

mem_free_c64_3d dtfft_plan Subroutine

Frees previously allocated memory specific for this plan

mem_free_host dtfft_utils Interface

Frees memory allocated with mem_alloc_host

mem_free_ptr dtfft_plan Subroutine

Frees previously allocated memory specific for this plan

mem_free_r32_1d dtfft_plan Subroutine

Frees previously allocated memory specific for this plan

mem_free_r32_2d dtfft_plan Subroutine

Frees previously allocated memory specific for this plan

mem_free_r32_3d dtfft_plan Subroutine

Frees previously allocated memory specific for this plan

mem_free_r64_1d dtfft_plan Subroutine

Frees previously allocated memory specific for this plan

mem_free_r64_2d dtfft_plan Subroutine

Frees previously allocated memory specific for this plan

mem_free_r64_3d dtfft_plan Subroutine

Frees previously allocated memory specific for this plan

mkl_dfti_commit_desc dtfft_interface_mkl_m Interface

Performs all initialization for the actual FFT computation.

mkl_dfti_create_desc dtfft_interface_mkl_m Interface

Allocates the descriptor data structure and initializes it with default configuration values.

mkl_dfti_execute dtfft_interface_mkl_m Interface

Computes FFT.

mkl_dfti_free_desc dtfft_interface_mkl_m Interface

Frees the memory allocated for a descriptor.

mkl_dfti_mem_alloc dtfft_interface_mkl_m Interface

Allocates pointer via mkl_malloc

mkl_dfti_mem_free dtfft_interface_mkl_m Interface

Frees pointer via mkl_free

mkl_dfti_set_value dtfft_interface_mkl_m Interface

Sets one particular configuration parameter with the specified configuration value.

ncclCommDeregister dtfft_interface_nccl Interface

Deregister a buffer for collective communication.

ncclCommDestroy dtfft_interface_nccl Interface

Destroy a communicator object comm.

ncclCommInitRank dtfft_interface_nccl Interface

Creates a new communicator (multi thread/process version).

ncclCommRegister dtfft_interface_nccl Interface

Register a buffer for collective communication.

ncclGetErrorString dtfft_interface_nccl Function

Generates an error message.

ncclGetErrorString_c dtfft_interface_nccl Interface

Returns a human-readable string corresponding to the passed error code.

ncclGetUniqueId dtfft_interface_nccl Interface

Generates an Id to be used in ncclCommInitRank. ncclGetUniqueId should be called once when creating a communicator and the Id should be distributed to all ranks in the communicator before calling ncclCommInitRank. uniqueId should point to a ncclUniqueId object allocated by the user.

ncclGroupEnd dtfft_interface_nccl Interface

End a group call.

ncclGroupStart dtfft_interface_nccl Interface

Start a group call.

ncclMemAlloc dtfft_interface_nccl Interface

Allocate a GPU buffer with size. Allocated buffer head address will be returned by ptr, and the actual allocated size can be larger than requested because of the buffer granularity requirements from all types of NCCL optimizations.

ncclMemFree dtfft_interface_nccl Interface

Free memory allocated by ncclMemAlloc().

ncclRecv dtfft_interface_nccl Interface

Receive data from rank peer into recvbuff.

ncclSend dtfft_interface_nccl Interface

Send data from sendbuff to rank peer.

nvrtcGetErrorString dtfft_interface_nvrtc Function

Helper function that returns a string describing the given nvrtcResult code. For unrecognized enumeration values, it returns “NVRTC_ERROR unknown”

nvshmem_free dtfft_interface_nvshmem Interface
nvshmem_malloc dtfft_interface_nvshmem Interface
nvshmem_my_pe dtfft_interface_nvshmem Interface
nvshmem_ptr dtfft_interface_nvshmem Interface
nvshmemx_float_alltoall_on_stream dtfft_interface_nvshmem Interface
nvshmemx_init_status dtfft_interface_nvshmem Interface
nvshmemx_sync_all_on_stream dtfft_interface_nvshmem Interface
nvtxDomainCreate_c dtfft_interface_nvtx Interface

Creates an NVTX domain with the specified name.

nvtxDomainRangePop_c dtfft_interface_nvtx Interface

Pops a range from the specified NVTX domain.

nvtxDomainRangePushEx_c dtfft_interface_nvtx Interface

Pushes a range with a custom message and color onto the specified NVTX domain.

operator(/=) dtfft_parameters Interface
operator(==) dtfft_parameters Interface
platform_eq dtfft_parameters Function
platform_ne dtfft_parameters Function
pop_nvtx_domain_range dtfft_interface_nvtx Subroutine

Pops a range from the NVTX domain

precision_eq dtfft_parameters Function
precision_ne dtfft_parameters Function
push_nvtx_domain_range dtfft_interface_nvtx Subroutine

Pushes a range to the NVTX domain

r2r_kind_eq dtfft_parameters Function
r2r_kind_ne dtfft_parameters Function
report dtfft_plan Subroutine

Prints plan-related information to stdout

run_autotune_backend dtfft_transpose_plan_cuda Subroutine

Runs autotune for all backends

run_cuda_kernel dtfft_interface_cuda Interface

Launches a CUDA function CUfunction or a CUDA kernel CUkernel.

run_mpi_a2a dtfft_backend_mpi Subroutine

Executes MPI all-to-all communication

run_mpi_p2p dtfft_backend_mpi Subroutine

Executes MPI point-to-point communication

set_unpack_kernel dtfft_abstract_backend Subroutine

Sets unpack kernel for pipelined backend

stream_from_int64 dtfft_parameters Function

Creates dtfft_stream_t from integer(cuda_stream_kind)

string dtfft_utils Interface

Creates string object

string_c2f dtfft_utils Subroutine

Convert C string to Fortran string

string_constructor dtfft_utils Function

Creates string object

string_f2c dtfft_utils Subroutine

Convert Fortran string to C string

to_cstr dtfft_nvrtc_kernel Subroutine

Converts Fortran CUDA code to C pointer

transpose dtfft_plan Subroutine

Performs single transposition

transpose_ptr dtfft_plan Subroutine

Performs single transposition using type(c_ptr) pointers instead of buffers

transpose_type_eq dtfft_parameters Function
transpose_type_ne dtfft_parameters Function
unload_library dtfft_utils Subroutine

Unloads library

write_message dtfft_utils Subroutine

Write message to the specified unit

Call graph (call relationships between the procedures, interfaces, and type-bound procedures)
proc~dl_error dl_error proc~dl_error->interface~dlerror proc~dl_error->proc~get_log_enabled proc~dl_error->proc~string_c2f proc~dl_error->proc~write_message proc~dtfft_create_config dtfft_create_config proc~dtfft_create_plan_c2c_c dtfft_create_plan_c2c_c proc~get_comm get_comm proc~dtfft_create_plan_c2c_c->proc~get_comm proc~dtfft_create_plan_c2c_c->create proc~dtfft_create_plan_r2r_c dtfft_create_plan_r2r_c proc~dtfft_create_plan_r2r_c->proc~get_comm proc~dtfft_create_plan_r2r_c->create proc~dtfft_destroy_c dtfft_destroy_c proc~dtfft_destroy_c->proc~destroy proc~dtfft_destroy_c->is_null_ptr proc~dtfft_execute_c dtfft_execute_c proc~execute_ptr dtfft_plan_t%execute_ptr proc~dtfft_execute_c->proc~execute_ptr proc~dtfft_execute_c->is_null_ptr proc~dtfft_get_alloc_bytes_c dtfft_get_alloc_bytes_c proc~get_alloc_bytes dtfft_plan_t%get_alloc_bytes proc~dtfft_get_alloc_bytes_c->proc~get_alloc_bytes proc~dtfft_get_alloc_bytes_c->is_null_ptr proc~dtfft_get_alloc_size_c dtfft_get_alloc_size_c proc~dtfft_get_alloc_size_c->proc~get_alloc_size proc~dtfft_get_alloc_size_c->is_null_ptr proc~dtfft_get_backend_c dtfft_get_backend_c proc~get_backend dtfft_plan_t%get_backend proc~dtfft_get_backend_c->proc~get_backend proc~dtfft_get_backend_c->is_null_ptr proc~dtfft_get_backend_string_c dtfft_get_backend_string_c proc~dtfft_get_backend_string_c->proc~dtfft_get_backend_string proc~dtfft_get_backend_string_c->proc~string_f2c proc~dtfft_get_cuda_stream dtfft_get_cuda_stream proc~dtfft_get_element_size_c dtfft_get_element_size_c proc~dtfft_get_element_size_c->proc~get_element_size proc~dtfft_get_element_size_c->is_null_ptr proc~dtfft_get_error_string_c dtfft_get_error_string_c proc~dtfft_get_error_string_c->proc~dtfft_get_error_string proc~dtfft_get_error_string_c->proc~string_f2c proc~dtfft_get_local_sizes_c dtfft_get_local_sizes_c proc~get_local_sizes dtfft_plan_t%get_local_sizes proc~dtfft_get_local_sizes_c->proc~get_local_sizes proc~dtfft_get_local_sizes_c->is_null_ptr 
proc~dtfft_get_pencil_c dtfft_get_pencil_c proc~get_pencil dtfft_plan_t%get_pencil proc~dtfft_get_pencil_c->proc~get_pencil proc~dtfft_get_pencil_c->is_null_ptr proc~dtfft_get_platform_c dtfft_get_platform_c proc~get_platform dtfft_plan_t%get_platform proc~dtfft_get_platform_c->proc~get_platform proc~dtfft_get_platform_c->is_null_ptr proc~dtfft_get_stream_c dtfft_get_stream_c proc~dtfft_get_stream_c->none~get_stream proc~dtfft_get_stream_c->is_null_ptr proc~dtfft_get_z_slab_enabled_c dtfft_get_z_slab_enabled_c proc~get_z_slab_enabled dtfft_plan_t%get_z_slab_enabled proc~dtfft_get_z_slab_enabled_c->proc~get_z_slab_enabled proc~dtfft_get_z_slab_enabled_c->is_null_ptr proc~dtfft_mem_alloc_c dtfft_mem_alloc_c proc~dtfft_mem_alloc_c->proc~mem_alloc_ptr proc~dtfft_mem_alloc_c->is_null_ptr proc~dtfft_mem_free_c dtfft_mem_free_c proc~dtfft_mem_free_c->proc~mem_free_ptr proc~dtfft_mem_free_c->is_null_ptr proc~dtfft_report_c dtfft_report_c proc~report dtfft_plan_t%report proc~dtfft_report_c->proc~report proc~dtfft_report_c->is_null_ptr proc~dtfft_set_config dtfft_set_config proc~dtfft_set_config->interface~cudastreamquery proc~is_valid_gpu_backend is_valid_gpu_backend proc~dtfft_set_config->proc~is_valid_gpu_backend proc~is_valid_platform is_valid_platform proc~dtfft_set_config->proc~is_valid_platform proc~dtfft_set_config->is_null_ptr proc~dtfft_set_config_c dtfft_set_config_c proc~dtfft_set_config_c->proc~dtfft_set_config proc~dtfft_transpose_c dtfft_transpose_c proc~transpose_ptr dtfft_plan_t%transpose_ptr proc~dtfft_transpose_c->proc~transpose_ptr proc~dtfft_transpose_c->is_null_ptr proc~dynamic_load dynamic_load proc~dynamic_load->interface~is_null_ptr proc~load_library load_library proc~dynamic_load->proc~load_library proc~load_symbol load_symbol proc~dynamic_load->proc~load_symbol proc~unload_library unload_library proc~dynamic_load->proc~unload_library proc~execute dtfft_plan_t%execute proc~execute->proc~execute_ptr proc~execute_cuda transpose_plan_cuda%execute_cuda 
proc~execute~8 transpose_handle_cuda%execute proc~execute_cuda->proc~execute~8 proc~execute_mpi backend_mpi%execute_mpi proc~execute_mpi->interface~cudastreamsynchronize proc~execute_mpi->interface~int_to_str proc~execute_mpi->proc~cudageterrorstring proc~execute~3 nvrtc_kernel%execute proc~execute_mpi->proc~execute~3 proc~run_mpi_a2a run_mpi_a2a proc~execute_mpi->proc~run_mpi_a2a proc~run_mpi_p2p run_mpi_p2p proc~execute_mpi->proc~run_mpi_p2p proc~execute_mpi->mpi_abort proc~execute_mpi->mpi_wait mpi_waitall mpi_waitall proc~execute_mpi->mpi_waitall proc~execute_nccl backend_nccl%execute_nccl proc~execute_nccl->interface~int_to_str proc~execute_nccl->interface~ncclgroupend proc~execute_nccl->interface~ncclgroupstart proc~execute_nccl->interface~ncclrecv proc~execute_nccl->interface~ncclsend proc~execute_nccl->proc~execute~3 proc~execute_nccl->proc~ncclgeterrorstring proc~execute_nccl->mpi_abort proc~execute_private dtfft_plan_t%execute_private proc~execute~5 abstract_transpose_plan%execute proc~execute_private->proc~execute~5 proc~execute_private~2 transpose_plan_host%execute_private proc~execute~12 transpose_handle_host%execute proc~execute_private~2->proc~execute~12 proc~execute_ptr->proc~check_aux proc~execute_ptr->proc~check_device_pointers proc~execute_ptr->proc~dtfft_get_error_string proc~execute_ptr->proc~execute_private proc~get_backend~2 abstract_transpose_plan%get_backend proc~execute_ptr->proc~get_backend~2 proc~execute_ptr->proc~get_log_enabled proc~is_same_ptr is_same_ptr proc~execute_ptr->proc~is_same_ptr proc~is_valid_execute_type is_valid_execute_type proc~execute_ptr->proc~is_valid_execute_type proc~execute_ptr->proc~pop_nvtx_domain_range proc~execute_ptr->proc~push_nvtx_domain_range proc~execute_ptr->proc~write_message proc~execute_ptr->is_null_ptr proc~execute~10 backend_cufftmp%execute proc~execute~10->interface~cufftmpexecreshapeasync proc~execute~10->interface~int_to_str proc~execute~10->interface~nvshmemx_sync_all_on_stream 
proc~execute~10->proc~cufftgeterrorstring proc~execute~10->mpi_abort proc~execute~11 vkfft_executor%execute mpi_alltoall_init mpi_alltoall_init proc~execute~12->mpi_alltoall_init mpi_alltoallw_init mpi_alltoallw_init proc~execute~12->mpi_alltoallw_init mpi_start mpi_start proc~execute~12->mpi_start proc~execute~12->mpi_wait proc~execute~2 cufft_executor%execute proc~execute~2->interface~cufftxtexec proc~execute~2->interface~int_to_str proc~execute~2->proc~cufftgeterrorstring proc~execute~2->mpi_abort proc~execute~3->interface~cudamemcpyasync proc~execute~3->interface~int_to_str proc~execute~3->proc~cudageterrorstring proc~execute~3->proc~culaunchkernel proc~execute~3->proc~get_contiguous_execution_blocks proc~execute~3->mpi_abort proc~execute~3->mpi_comm_rank proc~execute~4 mkl_executor%execute proc~execute~4->interface~int_to_str proc~execute~4->interface~mkl_dfti_commit_desc proc~execute~4->interface~mkl_dfti_execute proc~execute~4->interface~mkl_dfti_set_value proc~execute~4->proc~dftierrormessage proc~execute~4->mpi_abort proc~execute~5->proc~pop_nvtx_domain_range proc~execute~5->proc~push_nvtx_domain_range execute_private execute_private proc~execute~5->execute_private proc~execute~6 abstract_executor%execute proc~execute~6->proc~pop_nvtx_domain_range proc~execute~6->proc~push_nvtx_domain_range proc~execute~6->execute_private proc~execute~7 abstract_backend%execute proc~execute~7->interface~cudaeventrecord proc~execute~7->interface~cudamemcpyasync proc~execute~7->interface~cudastreamwaitevent proc~execute~7->interface~int_to_str proc~execute~7->proc~cudageterrorstring proc~execute~7->proc~execute~3 proc~execute~7->execute_private proc~execute~7->mpi_abort proc~execute~8->proc~execute~3 proc~execute~9 fftw_executor%execute proc~free_datatypes->mpi_type_free proc~free_mem free_mem proc~free_mem->interface~cudafree proc~free_mem->interface~int_to_str proc~free_mem->interface~ncclcommderegister proc~free_mem->interface~ncclmemfree 
proc~free_mem->interface~nvshmem_free proc~free_mem->proc~is_backend_nccl proc~free_mem->proc~is_backend_nvshmem proc~free_mem->proc~is_same_ptr proc~free_mem->proc~ncclgeterrorstring proc~free_mem->mpi_abort proc~get_alloc_bytes->proc~dtfft_get_error_string proc~get_alloc_bytes->proc~get_alloc_size proc~get_alloc_bytes->proc~get_element_size proc~get_alloc_bytes->proc~get_log_enabled proc~get_alloc_bytes->proc~write_message proc~get_alloc_size->proc~get_local_sizes proc~get_aux_size abstract_backend%get_aux_size proc~get_aux_size~2->proc~get_aux_size proc~get_backend->proc~get_backend~2 proc~get_backend_from_env get_backend_from_env proc~get_cached_kernel->proc~get_true_transpose_type proc~get_code_init get_code_init proc~get_code_init->interface~int_to_str proc~get_code_init->proc~add_line proc~get_datatype_from_env->interface~get_env proc~get_element_size->proc~dtfft_get_error_string proc~get_element_size->proc~get_log_enabled proc~get_element_size->proc~write_message proc~get_env_int32->interface~get_env proc~get_env_int32->proc~get_log_enabled proc~get_env_int32->proc~write_message proc~get_env_int8->interface~get_env proc~get_env_logical->interface~get_env proc~get_env_string->interface~get_env proc~get_env_string->proc~get_log_enabled proc~get_env_string->proc~write_message proc~get_iters_from_env get_iters_from_env proc~get_iters_from_env->interface~get_env proc~get_local_size->mpi_allgather proc~get_local_size->mpi_comm_rank proc~get_local_size->mpi_comm_size proc~get_local_sizes->proc~dtfft_get_error_string proc~get_local_sizes->proc~get_backend proc~get_local_sizes->proc~get_local_sizes~2 proc~get_local_sizes->proc~get_log_enabled proc~get_local_sizes->proc~is_backend_nvshmem proc~get_local_sizes->proc~write_message counts counts proc~get_local_sizes->counts proc~get_local_sizes->mpi_allreduce starts starts proc~get_local_sizes->starts proc~get_mpi_enabled_from_env get_mpi_enabled_from_env proc~get_mpi_enabled->proc~get_mpi_enabled_from_env 
proc~get_nccl_enabled_from_env get_nccl_enabled_from_env proc~get_nccl_enabled->proc~get_nccl_enabled_from_env proc~get_neighbor_function_code get_neighbor_function_code proc~get_neighbor_function_code->proc~add_line proc~get_nvshmem_enabled_from_env get_nvshmem_enabled_from_env proc~get_nvshmem_enabled->proc~get_nvshmem_enabled_from_env proc~get_pencil->proc~dtfft_get_error_string proc~get_pencil->proc~get_log_enabled proc~get_pencil->proc~write_message make_public make_public proc~get_pencil->make_public proc~get_pipe_enabled_from_env get_pipe_enabled_from_env proc~get_pipelined_enabled get_pipelined_enabled proc~get_pipelined_enabled->proc~get_pipe_enabled_from_env proc~get_plan_execution_time->interface~int_to_str proc~get_plan_execution_time->proc~create~12 proc~get_plan_execution_time->proc~destroy~13 proc~get_plan_execution_time->proc~double_to_str proc~get_plan_execution_time->proc~execute~12 proc~get_plan_execution_time->proc~get_iters_from_env proc~get_plan_execution_time->proc~get_log_enabled proc~get_plan_execution_time->proc~pop_nvtx_domain_range proc~get_plan_execution_time->proc~push_nvtx_domain_range proc~get_plan_execution_time->proc~write_message proc~get_plan_execution_time->mpi_allreduce proc~get_plan_execution_time->mpi_comm_size proc~get_plan_execution_time->mpi_wtime proc~get_platform_from_env get_platform_from_env proc~get_stream_int64->none~get_stream proc~get_stream_int64->proc~dtfft_get_cuda_stream proc~get_stream_ptr->proc~dtfft_get_error_string proc~get_stream_ptr->proc~get_log_enabled proc~get_stream_ptr->proc~write_message proc~get_tranpose_type transpose_handle_cuda%get_tranpose_type proc~get_transpose_kernel_code->proc~add_line proc~get_transpose_kernel_code->proc~get_code_init proc~get_transpose_kernel_code->proc~get_neighbor_function_code proc~get_unpack_kernel_code->proc~add_line proc~get_unpack_kernel_code->proc~get_code_init proc~get_unpack_kernel_code->proc~get_neighbor_function_code 
proc~get_unpack_pipelined_kernel_code->proc~add_line proc~get_unpack_pipelined_kernel_code->proc~get_code_init proc~get_unpack_pipelined_kernel_code->mpi_comm_rank proc~get_user_gpu_backend->proc~get_backend_from_env proc~get_user_platform->proc~get_platform_from_env proc~get_user_stream->interface~cudastreamcreate proc~get_user_stream->interface~int_to_str proc~get_user_stream->proc~cudageterrorstring proc~get_user_stream->mpi_abort proc~get_z_slab_from_env get_z_slab_from_env proc~get_z_slab->proc~get_z_slab_from_env proc~init_internal->interface~get_env proc~init_internal->proc~destroy_strings backends backends proc~init_internal->backends mpi_initialized mpi_initialized proc~init_internal->mpi_initialized platforms platforms proc~init_internal->platforms proc~is_null_ptr is_null_ptr proc~is_nvshmem_ptr->interface~nvshmem_my_pe proc~is_nvshmem_ptr->interface~nvshmem_ptr proc~is_nvshmem_ptr->is_null_ptr proc~is_valid_transpose_type is_valid_transpose_type proc~load load proc~load->proc~destroy_strings proc~load->proc~dynamic_load proc~load_cuda->proc~destroy_strings proc~load_cuda->proc~dynamic_load proc~load_library->interface~dlopen proc~load_library->interface~is_null_ptr proc~load_library->proc~astring_f2c proc~load_library->proc~dl_error proc~load_nvrtc->proc~destroy_strings proc~load_nvrtc->proc~dynamic_load proc~load_symbol->interface~dlsym proc~load_symbol->interface~is_null_ptr proc~load_symbol->proc~astring_f2c proc~load_symbol->proc~dl_error proc~load_vkfft->proc~load proc~make_plan->interface~int_to_str proc~make_plan->interface~mkl_dfti_commit_desc proc~make_plan->interface~mkl_dfti_create_desc proc~make_plan->interface~mkl_dfti_set_value proc~make_plan->proc~dftierrormessage proc~make_plan->mpi_abort proc~make_public pencil%make_public proc~mark_unused->proc~is_same_ptr proc~mem_alloc cufft_executor%mem_alloc proc~mem_alloc_c32_1d->proc~mem_alloc_ptr proc~mem_alloc_c32_2d->proc~mem_alloc_ptr proc~mem_alloc_c32_3d->proc~mem_alloc_ptr 
proc~mem_alloc_c64_1d->proc~mem_alloc_ptr proc~mem_alloc_c64_2d->proc~mem_alloc_ptr proc~mem_alloc_c64_3d->proc~mem_alloc_ptr proc~mem_alloc_ptr->interface~mem_alloc_host proc~mem_alloc_ptr->proc~dtfft_get_error_string proc~mem_alloc_ptr->proc~get_log_enabled proc~mem_alloc_ptr->proc~write_message proc~mem_alloc_ptr->is_null_ptr mem_alloc mem_alloc proc~mem_alloc_ptr->mem_alloc proc~mem_alloc_r32_1d->proc~mem_alloc_ptr proc~mem_alloc_r32_2d->proc~mem_alloc_ptr proc~mem_alloc_r32_3d->proc~mem_alloc_ptr proc~mem_alloc_r64_1d->proc~mem_alloc_ptr proc~mem_alloc_r64_2d->proc~mem_alloc_ptr proc~mem_alloc_r64_3d->proc~mem_alloc_ptr proc~mem_alloc~2 mkl_executor%mem_alloc proc~mem_alloc~2->interface~int_to_str proc~mem_alloc~2->interface~mkl_dfti_mem_alloc proc~mem_alloc~2->proc~dftierrormessage proc~mem_alloc~2->mpi_abort proc~mem_alloc~3 abstract_transpose_plan%mem_alloc proc~mem_alloc~3->proc~alloc_mem proc~mem_alloc~4 fftw_executor%mem_alloc fftw_malloc fftw_malloc proc~mem_alloc~4->fftw_malloc proc~mem_alloc~5 vkfft_executor%mem_alloc proc~mem_free cufft_executor%mem_free proc~mem_free_c32_1d->proc~mem_free_ptr proc~mem_free_c32_2d->proc~mem_free_ptr proc~mem_free_c32_3d->proc~mem_free_ptr proc~mem_free_c64_1d->proc~mem_free_ptr proc~mem_free_c64_2d->proc~mem_free_ptr proc~mem_free_c64_3d->proc~mem_free_ptr proc~mem_free_ptr->interface~mem_free_host proc~mem_free_ptr->proc~dtfft_get_error_string proc~mem_free_ptr->proc~get_log_enabled proc~mem_free_ptr->proc~write_message mem_free mem_free proc~mem_free_ptr->mem_free proc~mem_free_r32_1d->proc~mem_free_ptr proc~mem_free_r32_2d->proc~mem_free_ptr proc~mem_free_r32_3d->proc~mem_free_ptr proc~mem_free_r64_1d->proc~mem_free_ptr proc~mem_free_r64_2d->proc~mem_free_ptr proc~mem_free_r64_3d->proc~mem_free_ptr proc~mem_free~2 mkl_executor%mem_free proc~mem_free~2->interface~int_to_str proc~mem_free~2->interface~mkl_dfti_mem_free proc~mem_free~2->proc~dftierrormessage proc~mem_free~2->mpi_abort proc~mem_free~3->proc~free_mem 
proc~mem_free~4 fftw_executor%mem_free fftw_free fftw_free proc~mem_free~4->fftw_free proc~mem_free~5 vkfft_executor%mem_free proc~ncclgeterrorstring->interface~ncclgeterrorstring_c proc~ncclgeterrorstring->proc~string_c2f proc~nvrtcgeterrorstring->proc~string_c2f proc~pop_nvtx_domain_range->interface~nvtxdomainrangepop_c proc~push_nvtx_domain_range->interface~nvtxdomainrangepushex_c proc~push_nvtx_domain_range->proc~astring_f2c proc~push_nvtx_domain_range->proc~create_nvtx_domain proc~report->interface~dtfft_get_version proc~report->interface~int_to_str proc~report->proc~dtfft_get_backend_string proc~report->proc~dtfft_get_error_string proc~report->proc~get_backend~2 proc~report->proc~get_log_enabled proc~report->proc~write_message proc~report->mpi_cart_get proc~run_autotune_backend->interface~cudaeventcreate proc~run_autotune_backend->interface~cudaeventdestroy proc~run_autotune_backend->interface~cudaeventelapsedtime proc~run_autotune_backend->interface~cudaeventrecord proc~run_autotune_backend->interface~cudaeventsynchronize proc~run_autotune_backend->interface~cudastreamsynchronize proc~run_autotune_backend->interface~int_to_str proc~run_autotune_backend->proc~alloc_and_set_aux proc~run_autotune_backend->proc~alloc_mem proc~run_autotune_backend->proc~create_helper proc~run_autotune_backend->proc~cudageterrorstring proc~run_autotune_backend->proc~destroy~7 proc~run_autotune_backend->proc~double_to_str proc~run_autotune_backend->proc~dtfft_get_backend_string proc~run_autotune_backend->proc~dtfft_get_error_string proc~run_autotune_backend->proc~execute~8 proc~run_autotune_backend->proc~free_mem proc~run_autotune_backend->proc~get_iters_from_env proc~run_autotune_backend->proc~get_local_sizes~2 proc~run_autotune_backend->proc~get_log_enabled proc~run_autotune_backend->proc~get_mpi_enabled proc~run_autotune_backend->proc~get_nvshmem_enabled proc~run_autotune_backend->proc~get_pipelined_enabled proc~run_autotune_backend->proc~is_backend_mpi 
proc~run_autotune_backend->proc~is_backend_nccl proc~run_autotune_backend->proc~is_backend_nvshmem proc~run_autotune_backend->proc~is_backend_pipelined proc~run_autotune_backend->proc~pop_nvtx_domain_range proc~run_autotune_backend->proc~push_nvtx_domain_range proc~run_autotune_backend->proc~write_message proc~run_autotune_backend->mpi_abort proc~run_autotune_backend->mpi_allreduce mpi_barrier mpi_barrier proc~run_autotune_backend->mpi_barrier proc~run_autotune_backend->mpi_comm_size mpi_alltoallv_init mpi_alltoallv_init proc~run_mpi_a2a->mpi_alltoallv_init proc~run_mpi_a2a->mpi_start proc~run_mpi_p2p->mpi_comm_size mpi_recv_init mpi_recv_init proc~run_mpi_p2p->mpi_recv_init mpi_send_init mpi_send_init proc~run_mpi_p2p->mpi_send_init mpi_startall mpi_startall proc~run_mpi_p2p->mpi_startall proc~to_cstr->proc~astring_f2c proc~transpose dtfft_plan_t%transpose proc~transpose->proc~transpose_ptr proc~transpose_ptr->proc~check_device_pointers proc~transpose_ptr->proc~dtfft_get_error_string proc~transpose_ptr->proc~execute~5 proc~transpose_ptr->proc~get_backend~2 proc~transpose_ptr->proc~get_log_enabled proc~transpose_ptr->proc~is_same_ptr proc~transpose_ptr->proc~is_valid_transpose_type proc~transpose_ptr->proc~pop_nvtx_domain_range proc~transpose_ptr->proc~push_nvtx_domain_range proc~transpose_ptr->proc~write_message proc~unload_library->interface~dlclose proc~unload_library->proc~dl_error proc~write_message->mpi_comm_rank proc~write_message->mpi_finalized