upc_all_gather_all function
#include <upc.h>
#include <upc_collective.h>
void upc_all_gather_all(shared void * restrict dst,
shared const void * restrict src,
size_t nbytes,
upc_flag_t flags);
The upc_all_gather_all function
copies a block of memory from one shared
memory area with affinity to the ith thread to the ith block of a shared
memory area on each thread. The number of bytes in each block is nbytes.
nbytes must be strictly greater than 0.
The upc_all_gather_all function treats the src pointer as if it pointed to a
shared memory area of nbytes bytes on each thread and therefore had type:
shared [nbytes] char[nbytes * THREADS]
and it treats the dst pointer as if it pointed to a shared memory area with
the type:
shared [nbytes * THREADS] char[nbytes * THREADS * THREADS]
The targets of the src and dst pointers must have affinity to thread 0.
The src and dst pointers are treated as if they have phase 0.
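The two layouts above can be read as block-cyclic distributions: in an array declared `shared [B] char[N]`, byte k has affinity to thread (k / B) mod THREADS. The following plain C sketch only models that index arithmetic. The helper names `owner_of_byte`, `src_owner`, and `dst_owner` are hypothetical and not part of UPC, and the thread count is passed as an ordinary parameter instead of the UPC identifier THREADS.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: thread that owns byte k of an array laid out as
   shared [block] char[...] when run with `threads` threads. */
static size_t owner_of_byte(size_t k, size_t block, size_t threads)
{
    assert(block > 0 && threads > 0);
    return (k / block) % threads;
}

/* Owner of byte k in the src view:
   shared [nbytes] char[nbytes * THREADS]. */
size_t src_owner(size_t k, size_t nbytes, size_t threads)
{
    return owner_of_byte(k, nbytes, threads);
}

/* Owner of byte k in the dst view:
   shared [nbytes * THREADS] char[nbytes * THREADS * THREADS]. */
size_t dst_owner(size_t k, size_t nbytes, size_t threads)
{
    return owner_of_byte(k, nbytes * threads, threads);
}
```

Under this model each thread owns exactly one block of nbytes bytes in the src view and one contiguous block of nbytes * THREADS bytes in the dst view, with the first block of each on thread 0.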
The effect is equivalent to copying the ith block of nbytes bytes pointed to
by src to the ith block of nbytes bytes pointed to by dst that has affinity
to each thread.
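The data movement described above can be modelled serially in plain C. This is a sketch under stated assumptions, not the library's implementation: the function name `gather_all_model` is hypothetical, shared memory is replaced by two flat byte arrays, and the thread count is an ordinary parameter. Block i of `src` stands for the source block with affinity to thread i, and the row of `dst` starting at offset t * threads * nbytes stands for the destination area with affinity to thread t.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical serial model of the upc_all_gather_all data movement.
   src: threads blocks of nbytes bytes each.
   dst: threads rows, each holding threads blocks of nbytes bytes. */
void gather_all_model(unsigned char *dst, const unsigned char *src,
                      size_t nbytes, size_t threads)
{
    assert(nbytes > 0); /* nbytes must be strictly greater than 0 */

    for (size_t t = 0; t < threads; t++) {
        /* Destination row standing in for thread t's part of dst. */
        unsigned char *row = dst + t * threads * nbytes;

        /* Copy the ith source block into the ith block of this row. */
        for (size_t i = 0; i < threads; i++)
            memcpy(row + i * nbytes, src + i * nbytes, nbytes);
    }
}
```

After the call, every row of `dst` holds the same concatenation of all source blocks, which is the sense in which each thread ends up with a complete copy of the gathered data.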
upc_all_gather_all for the static THREADS translation environment.
#include <upc.h>
#include <upc_collective.h>
#define NELEMS 10
shared [NELEMS] int A[NELEMS*THREADS];
shared [NELEMS*THREADS] int B[THREADS][NELEMS*THREADS];

// Initialize A.
upc_barrier;
upc_all_gather_all( B, A, sizeof(int)*NELEMS,
                    UPC_IN_NOSYNC | UPC_OUT_NOSYNC );
upc_barrier;
upc_all_gather_all for the dynamic THREADS translation environment.
#include <upc.h>
#include <upc_collective.h>
#define NELEMS 10
shared [NELEMS] int A[NELEMS*THREADS];
shared int *Bdata;
shared [] int *myB;

Bdata = upc_all_alloc(THREADS*THREADS, NELEMS*sizeof(int));
myB = (shared [] int *)&Bdata[MYTHREAD];

// Bdata contains THREADS*THREADS*NELEMS elements.
// myB is MYTHREAD's row of Bdata.
// Initialize A.
upc_all_gather_all( Bdata, A, NELEMS*sizeof(int),
                    UPC_IN_ALLSYNC | UPC_OUT_ALLSYNC );