[Tau-announcements] TAU v2.20.3 released
sameer at cs.uoregon.edu
Thu Aug 18 16:44:05 PDT 2011
We are pleased to announce the release of TAU v2.20.3:
The following new features have been added since TAU v2.20.2, released on
May 13, 2011.
1. SHMEM Profiling
We have added support for tracking communication in SHMEM one-sided communication libraries. This release profiles SHMEM calls through both the C and Fortran interfaces, and supports generating a communication matrix (TAU_COMM_MATRIX=1) and tracing one-sided communication primitives (TAU_TRACE=1). The TAU API has been updated to track events that take place on remote nodes, and the merge and conversion tools have been updated as well.
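As a rough illustration of what a communication matrix captures, the sketch below tallies the bytes moved between pairs of processing elements (PEs). It is not TAU's implementation: the function name, the tuple layout, and the sample transfers are hypothetical stand-ins for the one-sided calls that a profiler would intercept.

```python
from collections import defaultdict


def build_comm_matrix(transfers):
    """Sum bytes per (source PE, destination PE) pair.

    `transfers` is an iterable of (src_pe, dst_pe, nbytes) tuples, a
    hypothetical stand-in for intercepted one-sided put calls.
    """
    matrix = defaultdict(int)
    for src, dst, nbytes in transfers:
        matrix[(src, dst)] += nbytes
    return dict(matrix)


# Hypothetical transfers among three PEs.
transfers = [(0, 1, 1024), (0, 1, 2048), (1, 2, 512), (2, 0, 256)]
comm_matrix = build_comm_matrix(transfers)

for (src, dst), total in sorted(comm_matrix.items()):
    print(f"PE {src} -> PE {dst}: {total} bytes")
```

Each entry is the total volume for one sender/receiver pair, which is the kind of per-pair summary a communication matrix view displays.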
2. Profiling with EBS (Event Based Sampling)
TAU Event-based sampling (EBS) now integrates sample information into TAU profile measurements at runtime. This feature introduces several significant changes to the way EBS is used:
i) The implementation of this feature changes the way EBS traces are
now handled. Previously, TAU_SAMPLING=1 enabled EBS traces
only. Now, TAU_SAMPLING=1 enables EBS functionality independent
of the generation of traces or profiles. Instead, TAU_TRACE=1
enables EBS traces in conjunction with
TAU_SAMPLING=1. Orthogonally, TAU_PROFILE=1 (default) in
conjunction with TAU_SAMPLING=1 enables profile output from TAU
with integrated sample information.
ii) EBS-integrated profiles add two new pieces of
information. [SAMPLE] entries record (as best as possible, using
BFD) the name, file location and line number of sampled
instructions along with the sample count and approximate time
spent (based on the sampling period). These [SAMPLE] entries are
recorded in the context of any measured TAU events during which
the samples were taken. [INTERMEDIATE] entries are introduced
for each TAU event entry with samples. [INTERMEDIATE] entries
sum the values of time spent and sample count of all samples
taken in the associated TAU event context. The main purpose of
[INTERMEDIATE] entries is to provide a quick and easy way of
comparing sampled metric information against measured metric
To use this feature, you may set the TAU_SAMPLING environment variable to 1 or use tau_exec:
% tau_exec -ebs -ebs_period=<count> -ebs_source=<papi_event_name> ./a.out
with an uninstrumented or an instrumented binary. This generates a flat profile containing the sample data. If you encounter issues with the BFD configuration, please configure TAU with -bfd=download; this enables the runtime demangler, with support for translating addresses to function names for both static and dynamic executables.
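The environment-variable combinations described in item (i) can be summarized as a small decision function. This is only a reading of the announcement's text, not TAU's code: the function name is hypothetical, TAU_TRACE is assumed to default to 0, and the stated TAU_PROFILE=1 default is taken at face value (how TAU resolves TAU_PROFILE when tracing is enabled is not modeled).

```python
def ebs_outputs(env):
    """Return the EBS-related outputs implied by the given settings.

    `env` maps environment variable names to string values. Defaults are
    assumptions: TAU_SAMPLING=0, TAU_TRACE=0, and TAU_PROFILE=1 (the
    default stated in the announcement).
    """
    sampling = env.get("TAU_SAMPLING", "0") == "1"
    trace = env.get("TAU_TRACE", "0") == "1"
    profile = env.get("TAU_PROFILE", "1") == "1"

    outputs = []
    if sampling and trace:
        outputs.append("EBS traces")
    if sampling and profile:
        outputs.append("profiles with integrated sample information")
    return outputs


# Sampling alone: profile output with sample data.
print(ebs_outputs({"TAU_SAMPLING": "1"}))
# Sampling plus tracing, with profiling explicitly turned off.
print(ebs_outputs({"TAU_SAMPLING": "1", "TAU_TRACE": "1", "TAU_PROFILE": "0"}))
```

The point of the sketch is that TAU_SAMPLING only switches sampling on; which files appear depends on the trace and profile settings.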
This feature has been tested on i386_linux, x86_64, craycnl, and bgp, with GNU, Intel and Pathscale compilers where available. IBM XLC compilers may exhibit problems with address translation.
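The relationship between [SAMPLE] and [INTERMEDIATE] entries described in item (ii) can be sketched as follows. This is an illustration under stated assumptions, not TAU's implementation: the function name and the sample data are hypothetical, and the sampling period is assumed to be a fixed time interval in microseconds, so that approximate time is the sample count multiplied by the period.

```python
from collections import defaultdict


def summarize_samples(samples, period_usec):
    """Build per-location and per-event summaries from raw samples.

    `samples` is a list of (event_name, location) pairs, where `location`
    is a hypothetical "file:line" string for the sampled instruction.
    Returns (sample_entries, intermediate_entries); every value is a
    (sample_count, approx_time_usec) tuple.
    """
    counts = defaultdict(int)
    for event, location in samples:
        counts[(event, location)] += 1

    # One [SAMPLE]-like entry per (event context, sampled location).
    sample_entries = {
        key: (n, n * period_usec) for key, n in counts.items()
    }

    # One [INTERMEDIATE]-like entry per event: the sum over its samples.
    totals = defaultdict(lambda: [0, 0])
    for (event, _location), (n, usec) in sample_entries.items():
        totals[event][0] += n
        totals[event][1] += usec
    intermediate_entries = {e: tuple(v) for e, v in totals.items()}

    return sample_entries, intermediate_entries


# Hypothetical samples taken while two TAU events were active.
samples = [
    ("main", "compute.c:42"),
    ("main", "compute.c:42"),
    ("main", "io.c:10"),
    ("solve", "compute.c:57"),
]
sample_entries, intermediate_entries = summarize_samples(samples, period_usec=1000)
print(intermediate_entries)
```

Comparing an event's intermediate total against its measured time is the comparison the [INTERMEDIATE] entries are meant to make easy.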
Our instrumentation approach is highlighted in our ICPP 2010 paper:
"Design and Implementation of a Hybrid Parallel Performance Measurement System"
3. GPU Profiling
Earlier releases supported NVIDIA's OpenCL library. This release adds support for OpenCL in the AMD Accelerated Parallel Processing (APP) architecture, and for PyCUDA. The TAU-GPU domain is extended to PyCUDA through the new TauGpuAdapterPyCuda.cpp, a Boost/Python-enabled interface to the CUDA Adapter. Build with the -cuda=<dir> and -python flags to generate the module, which may be loaded directly from Python.
4. ParaProf enhancements
ParaProf's topology display has been enhanced to support multiple interval and atomic events when designing a custom topology view of the performance data. A "sorted by" field now appears in bar chart windows when the sort key differs from the displayed event/metric. A context event menu for UserEvent control has been added to the context event table window. The auto-label option for nodes and threads is now applied to window names. ParaProf's source view window can now handle large files with over 10000 lines and show the line numbers correctly. TAU's memory estimator, used when ParaProf is spawned, has been updated. Also, TAU now supports OpenJDK.
5. SCORE-P integration
TAU can now generate callpath edges as well as node level data from the callgraph using the SCORE-P (www.score-p.org) measurement substrate when it emits the TAU snapshot profile data format.
6. Re-engineering TAU's tracing module
TAU can now track the time spent in flushing event records to disk. The event records are flushed periodically, whenever the buffer fills up. Earlier, we used a static buffer size (64K records) for each thread. Now, the user may specify the number of records dynamically using the TAU_MAX_RECORDS environment variable.
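The flush-when-full behavior described above can be sketched with a small buffer class. This is a conceptual illustration only, not TAU's tracing module: the class, its fields, and the helper function are hypothetical, a Python list stands in for the trace file, and falling back to 64K records when TAU_MAX_RECORDS is unset is an assumption based on the earlier static size.

```python
import time


def capacity_from_env(env, default=65536):
    """Read the record capacity from a dict of environment settings.

    Falling back to 64K records when TAU_MAX_RECORDS is unset is an
    assumption, not documented behavior.
    """
    return int(env.get("TAU_MAX_RECORDS", default))


class TraceBuffer:
    """Fixed-capacity record buffer that flushes to a sink when full."""

    def __init__(self, max_records, sink):
        self.max_records = max_records
        self.sink = sink          # list standing in for the trace file
        self.records = []
        self.flush_count = 0
        self.flush_seconds = 0.0  # accumulated time spent flushing

    def add(self, record):
        self.records.append(record)
        if len(self.records) >= self.max_records:
            self.flush()

    def flush(self):
        if not self.records:
            return
        start = time.perf_counter()
        self.sink.extend(self.records)
        self.records.clear()
        self.flush_seconds += time.perf_counter() - start
        self.flush_count += 1


# Ten records through a four-record buffer, then a final flush.
written = []
buf = TraceBuffer(max_records=4, sink=written)
for i in range(10):
    buf.add(i)
buf.flush()
print(buf.flush_count, len(written))
```

A smaller capacity means more frequent flushes, so timing the flushes separately keeps that overhead visible instead of letting it blend into the measured events.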
7. Other enhancements and bug fixes
i. Compiler-based instrumentation with Cray CCE compilers (use TAU_OPTIONS=-optCompInst).
ii. Added support for system-wide TAU configuration (<taudir>/tau_system_defaults/tau.conf file). Includes support for tracking job id in the PROFILEDIR file name.
iii. GPGPU CUPTI implementation cleaned up.
iv. Added support for Cray xt-shmem modules (older 4.x as well as newer 5.x).
v. OpenMPI library configuration now finds and uses the correct directory containing the Fortran interface (mpi.mod) in TAU compiler scripts.
vi. Special storage allocator checks are now defined in configure and EBS to support new C++ compilers.
vii. Added support for IBM Power7 Linux systems with updated MPI configurations.
viii. Added support for tau_exec's runtime preloading based tracking for dynamic executables generated using the Cray CCE compiler (-dynamic).
ix. Fixes for tau_wrap (varargs).
8. SC'11 Tutorials
We will be helping/participating in the following tutorials at SC'11:
a. Hands-on Practical Hybrid Parallel Application Performance Engineering:
b. A New and Improved Eclipse Parallel Tools Platform: Advancing the Development of Scientific Applications:
c. Scalable Heterogeneous Computing on GPU Clusters
9. Updated ISO image
We would like to thank our partners in the VI-HPS and ParaTools, Inc. HPC Linux projects for their contributions to this ISO image.
It features the new version of TAU and updated licenses.
Please let us know if we may assist you with TAU in any way.
(for tau-team@ cs.uoregon.edu)