upc_all_permute function
#include <upc.h>
#include <upc_collective.h>
void upc_all_permute(shared void * restrict dst,
                     shared const void * restrict src,
                     shared const int * restrict perm,
                     size_t nbytes, upc_flag_t flags);
The upc_all_permute function
copies a block of memory from a shared memory
area that has affinity to the ith thread to a block of shared memory
that has affinity to thread perm[i]. The number of bytes in each block is
nbytes.
nbytes must be strictly greater than 0.
perm[0..THREADS-1] must contain THREADS
distinct values:
0, 1, ..., THREADS-1.
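The requirement on perm can be checked before the collective call. The following is a minimal sketch in plain C, assuming the permutation has already been copied into an ordinary local array; the helper name is_valid_perm and its parameter n (standing in for THREADS) are hypothetical and not part of UPC.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper, not part of UPC: returns true when perm[0..n-1]
   contains each of the values 0, 1, ..., n-1 exactly once.
   n stands in for THREADS. */
bool is_valid_perm(const int *perm, int n)
{
    if (n <= 0)
        return false;

    /* seen[v] records whether value v has already appeared. */
    bool *seen = calloc((size_t)n, sizeof *seen);
    if (seen == NULL)
        return false;   /* allocation failed: the check cannot be completed */

    bool ok = true;
    for (int i = 0; i < n && ok; i++) {
        int v = perm[i];
        if (v < 0 || v >= n || seen[v])
            ok = false;     /* out of range, or a repeated value */
        else
            seen[v] = true;
    }

    free(seen);
    return ok;
}
```

Because n distinct values in the range 0..n-1 necessarily cover the whole range, rejecting out-of-range and repeated entries is enough to establish the condition stated above.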
The upc_all_permute function treats the src pointer and the dst pointer
as if each pointed to a shared memory area of nbytes bytes on each thread
and therefore had type:
shared [nbytes] char[nbytes * THREADS]
The targets of the src, perm, and dst pointers must have affinity to thread 0.
The src and dst pointers are treated as if they have phase 0.
The effect is equivalent to copying the block of nbytes bytes that has affinity
to thread i pointed to by src to the block of nbytes bytes that has affinity
to thread perm[i] pointed to by dst.
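The copy described above can be modeled serially. The sketch below is plain C rather than UPC, and it is an illustration under stated assumptions, not an implementation of the collective: each buffer is an ordinary array in which bytes [i*nbytes, (i+1)*nbytes) play the role of the block with affinity to thread i. The name permute_model and the parameter nthreads (standing in for THREADS) are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Serial model only, not a UPC implementation.
   Block i of src (the block that would have affinity to thread i)
   is copied to block perm[i] of dst. src and dst must not overlap,
   and perm must be a permutation of 0..nthreads-1. */
void permute_model(char *dst, const char *src, const int *perm,
                   size_t nbytes, int nthreads)
{
    for (int i = 0; i < nthreads; i++) {
        memcpy(dst + (size_t)perm[i] * nbytes,   /* destination block perm[i] */
               src + (size_t)i * nbytes,         /* source block i */
               nbytes);
    }
}
```

For example, with three 2-byte blocks "aa", "bb", "cc" in src and perm = {2, 0, 1}, the model leaves dst holding "bb", "cc", "aa": block 0 goes to position 2, block 1 to position 0, and block 2 to position 1.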
Example: upc_all_permute
#include <upc.h>
#include <upc_collective.h>
#define NELEMS 10
shared [NELEMS] int A[NELEMS*THREADS], B[NELEMS*THREADS];
shared int P[THREADS];
// Initialize A and P.
upc_barrier;
upc_all_permute( B, A, P, sizeof(int)*NELEMS,
                 UPC_IN_NOSYNC | UPC_OUT_NOSYNC );
upc_barrier;