Device kernel class
| Type | Visibility | Attributes | Name | Initial | Description |
|---|---|---|---|---|---|
| logical | public | | is_created | .false. | Flag indicating that the kernel has been created. |
| logical | public | | is_dummy | .false. | Whether the kernel is a dummy, i.e. whether it should do anything or not. |
| type(kernel_type_t) | public | | kernel_type | | Type of the kernel |
| character(len=:) | public | allocatable | kernel_string | | |
| integer(kind=int32) | public | allocatable | neighbor_data(:,:) | | Neighbor data for pipelined unpacking |
| integer(kind=int32) | public | allocatable | dims(:) | | Local dimensions to process |
| type(kernel_type_t) | private | | internal_kernel_type | | Actual kernel type used for execution, can be different from |
| type(CUfunction) | private | | cuda_kernel | | Pointer to CUDA kernel. |
| integer(kind=int32) | private | | tile_size | | Tile size used for this kernel |
| integer(kind=int32) | private | | block_rows | | Number of rows in each block processed by each thread |
| integer(kind=int64) | private | | copy_bytes | | Number of bytes to copy for |
Creates kernel
| Type | Intent | Optional | Attributes | Name | Description |
|---|---|---|---|---|---|
| class(abstract_kernel) | intent(inout) | | | self | Abstract kernel |
| integer(kind=int32) | intent(in) | | | dims(:) | Local dimensions to process |
| type(dtfft_effort_t) | intent(in) | | | effort | Effort level for generating transpose kernels |
| integer(kind=int64) | intent(in) | | | base_storage | Number of bytes needed to store a single element |
| type(kernel_type_t) | intent(in) | | | kernel_type | Type of kernel to build |
| integer(kind=int32) | intent(in) | optional | | neighbor_data(:,:) | Optional pointers for unpack kernels |
| logical | intent(in) | optional | | force_effort | Whether the effort level should be forced |
Executes kernel
| Type | Intent | Optional | Attributes | Name | Description |
|---|---|---|---|---|---|
| class(abstract_kernel) | intent(inout) | | | self | Abstract kernel |
| real(kind=real32) | intent(in) | | | in(:) | Source buffer, can be device or host pointer |
| real(kind=real32) | intent(inout) | | | out(:) | Target buffer, can be device or host pointer |
| type(dtfft_stream_t) | intent(in) | | | stream | Stream to execute on, used only for device pointers |
| integer(kind=int32) | intent(in) | optional | | neighbor | Source rank for pipelined unpacking |
Destroys kernel
| Type | Intent | Optional | Attributes | Name | Description |
|---|---|---|---|---|---|
| class(abstract_kernel) | intent(inout) | | | self | Abstract kernel |
Creates kernel
| Type | Intent | Optional | Attributes | Name | Description |
|---|---|---|---|---|---|
| class(kernel_device) | intent(inout) | | | self | Device kernel class |
| type(dtfft_effort_t) | intent(in) | | | effort | Effort level for generating transpose kernels |
| integer(kind=int64) | intent(in) | | | base_storage | Number of bytes needed to store a single element |
| logical | intent(in) | optional | | force_effort | Whether the effort level should be forced |
Executes kernel
Executes kernel on stream
| Type | Intent | Optional | Attributes | Name | Description |
|---|---|---|---|---|---|
| class(kernel_device) | intent(inout) | | | self | Device kernel class |
| real(kind=real32) | intent(in) | | target | in(:) | Device pointer |
| real(kind=real32) | intent(inout) | | target | out(:) | Device pointer |
| type(dtfft_stream_t) | intent(in) | | | stream | Stream to execute on |
| integer(kind=int32) | intent(in) | optional | | neighbor | Source rank for pipelined unpacking |
Destroys kernel
| Type | Intent | Optional | Attributes | Name | Description |
|---|---|---|---|---|---|
| class(kernel_device) | intent(inout) | | | self | Device kernel class |
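
The create, execute, destroy life cycle documented above can be illustrated with a small self-contained sketch. It is not the library source. This page does not show the procedure names, so the bindings `create`, `execute`, `destroy` and `execute_private`, the module `kernel_sketch` and the host-only type `kernel_copy` are all hypothetical. The `effort`, `base_storage`, `kernel_type` and `stream` arguments and every CUDA detail are left out so the example stays runnable on its own. Deriving `is_dummy` from an empty dimension is likewise an assumption.

```fortran
! Hypothetical stand-in for the documented interface; not the dtFFT source.
module kernel_sketch
  use iso_fortran_env, only: int32, real32
  implicit none
  private
  public :: abstract_kernel, kernel_copy

  ! Abstract base: shared state plus a deferred hook that extensions implement.
  type, abstract :: abstract_kernel
    logical :: is_created = .false.
    logical :: is_dummy = .false.
    integer(int32), allocatable :: dims(:)
  contains
    procedure :: create
    procedure :: execute
    procedure :: destroy
    procedure(execute_iface), deferred :: execute_private
  end type abstract_kernel

  abstract interface
    subroutine execute_iface(self, in, out, neighbor)
      import :: abstract_kernel, real32, int32
      class(abstract_kernel), intent(inout) :: self
      real(real32), intent(in) :: in(:)
      real(real32), intent(inout) :: out(:)
      integer(int32), intent(in), optional :: neighbor
    end subroutine execute_iface
  end interface

  ! Host-only extension that simply copies the local block.
  type, extends(abstract_kernel) :: kernel_copy
  contains
    procedure :: execute_private => copy_execute
  end type kernel_copy

contains

  subroutine create(self, dims)
    class(abstract_kernel), intent(inout) :: self
    integer(int32), intent(in) :: dims(:)
    call self%destroy()              ! start from a clean state
    self%dims = dims                 ! allocates on assignment
    self%is_dummy = any(dims == 0)   ! assumption: nothing to do for empty blocks
    self%is_created = .true.
  end subroutine create

  subroutine execute(self, in, out, neighbor)
    class(abstract_kernel), intent(inout) :: self
    real(real32), intent(in) :: in(:)
    real(real32), intent(inout) :: out(:)
    integer(int32), intent(in), optional :: neighbor
    if (.not. self%is_created) error stop "kernel is not created"
    if (self%is_dummy) return
    ! An absent optional argument may be forwarded to another optional dummy.
    call self%execute_private(in, out, neighbor)
  end subroutine execute

  subroutine destroy(self)
    class(abstract_kernel), intent(inout) :: self
    if (allocated(self%dims)) deallocate(self%dims)
    self%is_created = .false.
    self%is_dummy = .false.
  end subroutine destroy

  subroutine copy_execute(self, in, out, neighbor)
    class(kernel_copy), intent(inout) :: self
    real(real32), intent(in) :: in(:)
    real(real32), intent(inout) :: out(:)
    integer(int32), intent(in), optional :: neighbor  ! unused in this sketch
    integer :: n
    n = product(self%dims)
    out(1:n) = in(1:n)
  end subroutine copy_execute

end module kernel_sketch

program demo
  use iso_fortran_env, only: int32, real32
  use kernel_sketch, only: kernel_copy
  implicit none
  type(kernel_copy) :: kernel
  real(real32) :: src(6), dst(6)

  src = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
  dst = 0.0
  call kernel%create([2_int32, 3_int32])
  call kernel%execute(src, dst)
  print *, all(dst == src)      ! T: the block was copied
  call kernel%destroy()
  print *, kernel%is_created    ! F: the kernel was reset
end program demo
```

The split between a public `execute` wrapper and a deferred `execute_private` hook is one way to keep the created and dummy checks in a single place while each concrete kernel supplies only its own execution step.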