This module describes the transpose_handle_cuda class.
It is responsible for managing CUDA-based transposition operations.
It executes transpose kernels, performs memory transfers between GPUs, and unpacks data if required.
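
The three stages named above (transpose, transfer, unpack) can be illustrated with a minimal CPU-only sketch. It is not the implementation of transpose_handle_cuda: the names transpose_block, distributed_transpose, and Slab are hypothetical, each "device" is modelled as a plain host vector holding a contiguous slab of rows of a row-major matrix, and the row and column counts are assumed to divide evenly by the number of devices. On real GPUs, stage 1 would be a kernel launch and stage 2 a device-to-device transfer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical type: the data owned by one "device" (a host vector here).
using Slab = std::vector<double>;

// CPU stand-in for a transpose kernel: reads a row-major rows x cols
// block and returns its cols x rows transpose.
Slab transpose_block(const Slab& in, std::size_t rows, std::size_t cols) {
    assert(in.size() == rows * cols);
    Slab out(in.size());
    for (std::size_t r = 0; r < rows; ++r)
        for (std::size_t c = 0; c < cols; ++c)
            out[c * rows + r] = in[r * cols + c];
    return out;
}

// Transposes an n_rows x n_cols matrix that is split by rows across
// slabs.size() devices. Device d ends up with rows
// [d * n_cols / p, (d + 1) * n_cols / p) of the transposed matrix.
std::vector<Slab> distributed_transpose(const std::vector<Slab>& slabs,
                                        std::size_t n_rows,
                                        std::size_t n_cols) {
    const std::size_t p = slabs.size();
    assert(p > 0 && n_rows % p == 0 && n_cols % p == 0);
    const std::size_t rl = n_rows / p;   // rows per device before the transpose
    const std::size_t cl = n_cols / p;   // rows per device after the transpose
    const std::size_t chunk = cl * rl;   // elements sent to each peer

    // Stage 1: local transpose. The part destined for device dst is then
    // contiguous at offset dst * chunk in the send buffer.
    std::vector<Slab> sendbuf(p);
    for (std::size_t src = 0; src < p; ++src)
        sendbuf[src] = transpose_block(slabs[src], rl, n_cols);

    // Stage 2: exchange. Each device copies one chunk to every peer;
    // the receiver stores chunks ordered by source device.
    std::vector<Slab> recvbuf(p, Slab(p * chunk));
    for (std::size_t dst = 0; dst < p; ++dst)
        for (std::size_t src = 0; src < p; ++src)
            std::copy(sendbuf[src].begin() + dst * chunk,
                      sendbuf[src].begin() + (dst + 1) * chunk,
                      recvbuf[dst].begin() + src * chunk);

    // Stage 3: unpack. Interleave the received chunks so that each
    // device holds its rows of the transposed matrix in row-major order.
    std::vector<Slab> out(p, Slab(cl * n_rows));
    for (std::size_t dst = 0; dst < p; ++dst)
        for (std::size_t src = 0; src < p; ++src)
            for (std::size_t c = 0; c < cl; ++c)
                for (std::size_t r = 0; r < rl; ++r)
                    out[dst][c * n_rows + src * rl + r] =
                        recvbuf[dst][src * chunk + c * rl + r];
    return out;
}
```

In this sketch the unpack stage is needed because the received chunks arrive grouped by source device rather than in final row-major order. When only one device takes part, the received buffer is already in the final layout, which is one situation where unpacking would not be required.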