Keeneland

From TAU Wiki

Jump to: navigation, search

Contents

Guide for using TAU on Keeneland

Slide about TAU

TAU overview slides


Setting up environment

setup your environment this way:

   module load tau
   export TAU_MAKEFILE=$TAUROOT/lib/Makefile.tau-cupti-pdt

Compiling SHOC 1.0.1 with TAU

After configuring SHOC edit the config/common.mk to:


   # === Basics ===
   CC       = tau_cc.sh
   CXX      = tau_cxx.sh
   LD       = tau_cxx.sh
   AR       = /usr/bin/ar
   RANLIB   = ranlib
  
   CPPFLAGS += -I$(SHOC_ROOT)/src/common -I${SHOC_ROOT}/config
   CFLAGS   += -m64 -g -O2
   CXXFLAGS += -m64 -g -O2
   ARFLAGS  = rcv
   LDFLAGS  =
   LIBS     = -L$(SHOC_ROOT)/lib  -lrt -L/sw/keeneland/cuda/3.2RC/lib64/ -lcudart
  
   USE_MPI         = no
  
   OCL_CPPFLAGS    += -I${SHOC_ROOT}/src/opencl/common
   OCL_LIBS        =
  
   NVCC            = /sw/keeneland/cuda/3.2/bin/nvcc
   CUDA_CXX        = tau_cxx.sh
   CUDA_INC        = -I/sw/keeneland/cuda/3.2/include
   CUDA_CPPFLAGS   += -gencode=arch=compute_10,code=sm_10 \
   -gencode=arch=compute_11,code=sm_11  -gencode=arch=compute_13,code=sm_13 \
   -gencode=arch=compute_20,code=sm_20  -gencode=arch=compute_20,code=compute_20 \
   -I${SHOC_ROOT}/src/cuda/include $(TAU_LIBS)
  

Then make/install as you normally would.

More info at: TAU's userguide

Building SHOC with VampirTrace

In this case edit the config/common.mk to read:

   # === Basics ===
   CC       = vtcc --vt:cc mpicc
   CXX      = vtcxx --vt:cxx mpicxx
   LD       = vtcxx --vt:cxx mpicxx
   AR       = /usr/bin/ar
   RANLIB   = ranlib
  
   CPPFLAGS += -I$(SHOC_ROOT)/src/common -I${SHOC_ROOT}/config
   CFLAGS   += -m64 -g -O2
   CXXFLAGS += -m64 -g -O2
   ARFLAGS  = rcv
   LDFLAGS  =
   LIBS     = -L$(SHOC_ROOT)/lib  -lrt -L/sw/keeneland/cuda/3.2RC/lib64/ -lcudart
  
   USE_MPI         = no
  
   OCL_CPPFLAGS    += -I${SHOC_ROOT}/src/opencl/common
   OCL_LIBS        =
  
   NVCC            = vtnvcc
   CUDA_CXX        = vtnvcc
   CUDA_INC        = -I/sw/keeneland/cuda/3.2/include
   CUDA_CPPFLAGS   += -gencode=arch=compute_10,code=sm_10 \
   -gencode=arch=compute_11,code=sm_11  -gencode=arch=compute_13,code=sm_13 \
   -gencode=arch=compute_20,code=sm_20  -gencode=arch=compute_20,code=compute_20 \
   -I${SHOC_ROOT}/src/cuda/include $(TAU_LIBS)

Running CUDA applications

Both CUDA and OpenCL are instrumented dynamically through library preloading, use the tau_exec script to run the CUDA application:

   %> tau_exec -T serial,cupti -cupti ./Stencil2D

The -T serial specifies with TAU configuration to use, you can change this for MPI applications and run:

   %> mpirun -np 4 tau_exec -T mpi,cupti -cupti ./SGEMM

This could be done with executables build with or without TAU.

Traces

Traces can be recorded by first setting:

   %> export TAU_TRACE=1
   %> tau_exec -T serial,cupti -cupti ./Stencil2D
   %> tau_multimerge
   %> tau2slog2 tau.trc tau.edf -o stencil2d.slog2
   %> jumpshot stencil2d.slog2

Running OpenCL applications

Use tau_exec as well:

   %> tau_exec -T serial -opencl ./SGEMM 


Performance Data

Some example performance data from S3D:

Image:S3D-cuda.ppk and Image:S3D-cuda.slog2

Personal tools