run_cuda_kernel Interface

interface

Launches a CUDA function CUfunction or a CUDA kernel CUkernel.


Called by

Call graph (caller -> callee):

cuLaunchKernel -> run_cuda_kernel
nvrtc_kernel%execute -> cuLaunchKernel
backend_mpi%execute_mpi -> nvrtc_kernel%execute
backend_nccl%execute_nccl -> nvrtc_kernel%execute
abstract_backend%execute -> nvrtc_kernel%execute
transpose_handle_cuda%execute -> nvrtc_kernel%execute
transpose_plan_cuda%execute_cuda -> transpose_handle_cuda%execute
run_autotune_backend -> transpose_handle_cuda%execute
autotune_grid -> run_autotune_backend
transpose_plan_cuda%create_cuda -> run_autotune_backend
transpose_plan_cuda%create_cuda -> autotune_grid_decomposition
autotune_grid_decomposition -> autotune_grid

private function run_cuda_kernel(func, in, out, blocks, threads, stream, args, funptr) result(cuResult) bind(C, name="run_cuda_kernel")

Arguments

Type Intent Optional Attributes Name
type(CUfunction), value :: func

Function CUfunction or Kernel CUkernel to launch

type(c_ptr), value :: in

Input pointer

type(c_ptr), value :: out

Output pointer

type(dim3) :: blocks

Grid dimensions, in blocks

type(dim3) :: threads

Thread block dimensions, in threads

type(dtfft_stream_t), value :: stream

Stream identifier

type(kernelArgs) :: args

Kernel parameters

type(c_funptr), value :: funptr

Pointer to cuLaunchKernel

Return Value integer(kind=c_int)

Driver result code

Description

Wrapper around cuLaunchKernel, since I have no idea how to pass an array of pointers to cuLaunchKernel.

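The C side of such a wrapper might look roughly like the sketch below. It is not the actual implementation: the layouts of dim3 and kernelArgs, the name run_cuda_kernel_sketch, the argument order (in, out, integers, extra pointers) and the fake_launch mock are all assumptions made for illustration. Only the parameter list of cuLaunchKernel is taken from the CUDA driver API. The CUDA types are replaced by stand-ins so the sketch builds without cuda.h.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for CUDA driver types so the sketch builds without cuda.h. */
typedef int CUresult;                    /* CUDA_SUCCESS is 0 in the driver API */
typedef struct CUfunc_st *CUfunction;
typedef struct CUstream_st *CUstream;

/* Hypothetical layouts; the real dim3 and kernelArgs types may differ. */
typedef struct { int x, y, z; } dim3;
enum { MAX_INTS = 5, MAX_PTRS = 3 };
typedef struct {
    int   n_ints;
    int   ints[MAX_INTS];
    int   n_ptrs;
    void *ptrs[MAX_PTRS];
} kernelArgs;

/* Same parameter list as cuLaunchKernel in the CUDA driver API. */
typedef CUresult (*launch_fn)(CUfunction f,
                              unsigned gx, unsigned gy, unsigned gz,
                              unsigned bx, unsigned by, unsigned bz,
                              unsigned shared_mem_bytes, CUstream stream,
                              void **kernel_params, void **extra);

/* Builds the array of pointers that cuLaunchKernel expects, then launches. */
CUresult run_cuda_kernel_sketch(CUfunction func, void *in, void *out,
                                const dim3 *blocks, const dim3 *threads,
                                CUstream stream, kernelArgs *args,
                                launch_fn launch)
{
    void *params[2 + MAX_INTS + MAX_PTRS];
    int n = 0;

    assert(args->n_ints >= 0 && args->n_ints <= MAX_INTS);
    assert(args->n_ptrs >= 0 && args->n_ptrs <= MAX_PTRS);

    /* Each entry is the ADDRESS of one kernel argument, not its value. */
    params[n++] = &in;    /* assumed kernel argument order: in, out, ... */
    params[n++] = &out;
    for (int i = 0; i < args->n_ints; ++i) params[n++] = &args->ints[i];
    for (int i = 0; i < args->n_ptrs; ++i) params[n++] = &args->ptrs[i];

    return launch(func,
                  (unsigned)blocks->x, (unsigned)blocks->y, (unsigned)blocks->z,
                  (unsigned)threads->x, (unsigned)threads->y, (unsigned)threads->z,
                  0 /* no dynamic shared memory */, stream, params, NULL);
}

/* Mock launcher (hypothetical): records what it receives, needs no GPU.
   It reads kernel_params[2], so it expects at least one integer argument. */
unsigned last_grid[3], last_block[3];
void *last_in;
int last_first_int;

CUresult fake_launch(CUfunction f,
                     unsigned gx, unsigned gy, unsigned gz,
                     unsigned bx, unsigned by, unsigned bz,
                     unsigned shared_mem_bytes, CUstream stream,
                     void **kernel_params, void **extra)
{
    (void)f; (void)shared_mem_bytes; (void)stream; (void)extra;
    last_grid[0] = gx;  last_grid[1] = gy;  last_grid[2] = gz;
    last_block[0] = bx; last_block[1] = by; last_block[2] = bz;
    last_in = *(void **)kernel_params[0];
    last_first_int = *(int *)kernel_params[2];
    return 0;
}
```

cuLaunchKernel takes its kernel parameters as a void** in which every element points to the storage of one argument. Taking those addresses is a one-liner in C, which is presumably why the launch goes through a C wrapper that receives the function pointer as funptr. In real use, the pointer obtained for cuLaunchKernel would be passed in place of fake_launch.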