Generates code that will be used to locally transpose data and prepare it to be sent to other processes. ndims == 2
Type | Intent | Optional | Attributes | Name | Description
---|---|---|---|---|---
character(len=*) | intent(in) | | | kernel_name | Name of CUDA kernel
integer(kind=int8) | intent(in) | | | ndims | Number of dimensions
integer(kind=int64) | intent(in) | | | base_storage | Number of bytes needed to store a single element
type(dtfft_transpose_t) | intent(in) | | | transpose_type | Transpose id
logical | intent(in) | | | enable_packing | Whether data should be manually packed
logical | intent(in) | | | enable_multiprocess | Whether a thread should process more than one element

Resulting code
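
To illustrate what "generating code" for a local 2D transpose can look like, here is a minimal host-side sketch in C++ that builds CUDA kernel source as a string. It is not the library's actual generator or its real output: the function names `element_type` and `generate_transpose_2d`, the kernel signature, and the mapping from `base_storage` to a CUDA type are all assumptions made for this example. Packing, multiple elements per thread, and the different `transpose_type` values are deliberately left out.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper: pick a CUDA type whose size matches base_storage (bytes).
std::string element_type(std::int64_t base_storage) {
    switch (base_storage) {
        case 4:  return "float";
        case 8:  return "double";
        case 16: return "double2";
        default: throw std::invalid_argument("unsupported element size");
    }
}

// Hypothetical generator: returns CUDA source for a naive 2D transpose kernel.
// The input is ny rows of nx elements; the output is nx rows of ny elements.
std::string generate_transpose_2d(const std::string& kernel_name,
                                  std::int64_t base_storage) {
    const std::string t = element_type(base_storage);
    std::string code;
    code += "extern \"C\" __global__ void " + kernel_name + "(\n";
    code += "    " + t + "* __restrict__ out, const " + t + "* __restrict__ in,\n";
    code += "    int nx, int ny)\n";
    code += "{\n";
    code += "    int x = blockIdx.x * blockDim.x + threadIdx.x;\n";
    code += "    int y = blockIdx.y * blockDim.y + threadIdx.y;\n";
    // One thread moves one element; the bounds check covers partial blocks.
    code += "    if (x < nx && y < ny) out[x * ny + y] = in[y * nx + x];\n";
    code += "}\n";
    return code;
}
```

Under these assumptions, calling `generate_transpose_2d("my_kernel", 8)` would return source for a kernel named `my_kernel` that operates on `double` elements. That string could then be handed to a runtime compiler.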