[Tau-announcements] New versions of TAU, PDT, and LiveDVD released

Sameer Shende sameer at cs.uoregon.edu
Thu Nov 10 21:09:46 PST 2011


	We are pleased to announce the release of TAU v2.21:


The following new features have been added since TAU 2.20.3 released on Aug. 18, 2011.
Our SC'11 demo schedule is at the end of this e-mail.

1. Binary rewriter based on MAQAO

We introduce a new tool, tau_rewrite, for rewriting dynamic executables. It uses MAQAO technology from the Intel Exascale Lab, U. Versailles, which is integrated in PDT v3.17. It currently supports routine-level instrumentation for C and Fortran binaries under Linux x86_64. To use this tool, configure TAU with PDT v3.17 and invoke:
% tau_rewrite a.out -o a.inst
You may select TAU configurations with the -T <options> flag (MPI is enabled by default):
% tau_rewrite -T mpi,papi,pdt a.out -o a.inst
% mpirun -np 256 ./a.inst

This -T command-line option works consistently across tau_rewrite, tau_run (based on DyninstAPI), and tau_exec (based on runtime preloading of shared objects).
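
As a rough sketch of how one configuration string can be shared between the tools, the shell fragment below builds the comma-separated -T list once and prints the corresponding tau_rewrite and tau_exec command lines rather than running them. The helper name tau_config_list, the process count of 4, and the exact form of the tau_exec invocation are assumptions made for illustration; please check the usage output of each tool in your own installation.

```shell
#!/bin/sh
# Hypothetical helper (not part of TAU): join configuration names into the
# comma-separated list that -T expects, e.g. "mpi papi pdt" -> "mpi,papi,pdt".
tau_config_list() {
    echo "$*" | tr ' ' ','
}

CONFIG=$(tau_config_list mpi papi pdt)

# Print the command lines instead of executing them, so this sketch can be
# tried on a machine where TAU is not installed.
echo "tau_rewrite -T $CONFIG a.out -o a.inst"
echo "mpirun -np 4 tau_exec -T $CONFIG ./a.out"
```

Keeping the list in one variable avoids the two invocations drifting apart when a configuration is added or removed.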

We wish to thank the Intel Exascale Lab team for their help and support in integrating this binary rewriter in TAU.

2. OpenSHMEM Profiling
TAU now supports the OpenSHMEM profiling interface. Configure TAU with the -shmem option (with the OpenSHMEM bin directory in your path) to use this feature.

3. Score-P Atomic/Context Events
TAU can now generate atomic and context events using the Score-P (www.score-p.org) measurement substrate when it emits the TAU snapshot profile data format.  Context events are listed with their full callpaths. Score-P version 1.0 beta must be used with TAU.

4. Opari2
Opari has been updated to Opari2 version 1.0 beta. This should help with common instrumentation issues and will also allow the instrumentation of OpenMP 3.0.

5. Support for NVIDIA CUPTI v4.1

This release of TAU supports CUDA 4.1. It tracks the time spent on the GPU for each memory copy between the host and the device. It also supports performance analysis of asynchronous memory copy techniques, such as overlapping memory copies with GPU kernel execution. To use this feature, please download the latest NVIDIA driver (v285.05.15 or later) and the related CUDA 4.1 distribution. We would like to thank the NVIDIA Corporation for their support of the TAU project.

6. H2 database in PerfDMF

We have integrated the H2 database in TAU. Earlier, we provided support for the file-based Derby database, which, unlike PostgreSQL and MySQL, did not require special servers to be run on pre-assigned ports. However, Derby did not support concurrent access to the performance database. H2 supports concurrent access without requiring system administration privileges to run a server. It is also the default database that is configured when a user issues:
% perfdmf_configure --create-default

7. Support for debugging
We have introduced a new environment variable, TAU_TRACK_SIGNALS (set with export TAU_TRACK_SIGNALS=1), that allows TAU to capture the callstack of each thread of execution at the time of program failure and write it as metadata in the profile files. Most common signals can be caught this way with a TAU-instrumented executable.
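
If you would rather not export the variable for the whole session, a small wrapper can set it for one run only. This is a minimal sketch: the function name run_with_signal_tracking is hypothetical, and the sh -c command is a stand-in that simply shows the child process sees the variable.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of TAU): enable signal tracking for a single
# command without exporting the variable into the rest of the shell session.
run_with_signal_tracking() {
    env TAU_TRACK_SIGNALS=1 "$@"
}

# Stand-in command: prints the value that the child process receives.
run_with_signal_tracking sh -c 'echo "TAU_TRACK_SIGNALS=$TAU_TRACK_SIGNALS"'
```

With a TAU-instrumented binary the call would look like run_with_signal_tracking ./a.inst. Under an MPI launcher, whether the variable reaches every rank depends on the launcher.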

8. ParaProf enhancements
ParaProf and PerfDMF now support Score-P's CUBE4 scalable profile data format. ParaProf supports a new view for visualizing atomic and context event data in the 3D topology display. It also supports max/min/mean/std. deviation profile summaries that may be created when a user executes an application with TAU_SUMMARY=1 to write just the summary data.
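
To make the summary statistics concrete, the sketch below computes min/max/mean/std. deviation over a handful of made-up per-thread values with awk. It does not read TAU profile files, the sample numbers are invented, and the use of the population (rather than sample) standard deviation is an assumption, since this announcement does not say which one TAU reports.

```shell
#!/bin/sh
# Hypothetical helper (not part of TAU): read one numeric value per line,
# e.g. the time of one routine on each thread, and print summary statistics.
summarize() {
    awk '
        { v = $1 + 0 }                      # force numeric interpretation
        NR == 1 { min = v; max = v }
        {
            if (v < min) min = v
            if (v > max) max = v
            sum += v
            sumsq += v * v
        }
        END {
            if (NR == 0) { print "no data"; exit }
            mean = sum / NR
            var = sumsq / NR - mean * mean  # population variance (assumption)
            if (var < 0) var = 0            # guard against rounding error
            printf "min=%.2f max=%.2f mean=%.2f std=%.2f\n", min, max, mean, sqrt(var)
        }'
}

# Invented sample: eight per-thread values.
printf '%s\n' 2 4 4 4 5 5 7 9 | summarize
# prints: min=2.00 max=9.00 mean=5.00 std=2.00
```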

9. New compiler support
a. MINGW cross compilers for Windows 7 running under Linux/Windows. TAU also supports Microsoft MPI and runs on the Windows HPC Server 2008 and Azure cloud computing platform.  We would like to thank Microsoft Corporation's Microsoft Developer Platform Evangelism team for their support of TAU.
b. Intel v12.x compilers
c. NAG 5.3 Fortran compilers

10. UPC Instrumentation
TAU now supports the Rose parser from LLNL for UPC instrumentation (beta). See examples/upc for an example. The Rose parser is integrated in PDT v3.17 and emits PDB records that are used by the tau_instrumentor. The parsers cxxparse and cparse can fall back to the roseparse tool transparently when an error occurs. This provides wider coverage for C/C++ instrumentation without requiring any changes in the build system. We would like to thank the LLNL Rose team (rosecompiler.org) for their support.

Program Database Toolkit (PDT)

PDT v3.17 includes the Rose parser and MAQAO binary instrumentation tool.

for VirtualBox appliance. Please e-mail us for the password on this VM.

We have integrated the new releases of our software in a new distribution. This distribution now features:
* Performance evaluation tools: Score-P beta 1.0, TAU 2.21, PDT 3.17, Vampir 7.4 and 7.5 (featuring OTF1 & OTF2 support in Score-P), PAPI v4.2 and v4.2 with CUPTI 4.1 component supporting CUDA 4.1, CUBE 4.0 beta, Scalasca 1.3.3, VampirTrace 5.12, PerfSuite 1.1.0, ISP 0.3.0, PPW 2.6.2, DyninstAPI 7.1, kcachegrind, marmot, unimci, ParaVer, UNITE
* Runtime systems: OpenSHMEM 1.0, GASNet v1.18, CUDA 4.1, OpenMPI 1.4.2, 1.4.3, ptoolsrte 0.31
* Debuggers: Totalview 8.7.0-2
* IDEs: Eclipse Indigo and PTP with remote component/execution support
* Languages: Chapel 1.4.0, Python, pyMPI (ptoolsrte), Berkeley UPC 2.12.1
* Build tools: CMake 2.8.3
* Numerical libraries from the ACTS collection:
	- Trilinos 10.8.3, PETSc 3.2p5, SuperLU 4.0, Overture 23, Metis 4.0, Scalapack 1.8.0, Slepc 3.2.p1, sundials 2.4.0, Globus 5.0.3, HDF5 1.8.5, Global Arrays 5.0.2, ParMetis 3.1.1.
	- Trilinos supports Amesos, Anasazi, AnasaziEpetra, AztecOO, Belos, BelosEpetra, Epetra, FEI, Galeri, Ifpack, Isorropia, Kokkos, Komplex, Mesquite, ML, ModeLaplace, RTop, Sacado, Teuchos, Thyracore, Thyraepetra, Tpi, and Zoltan solvers in our distribution.

8. SC'11 Schedule
I. Tutorials
We will be helping/participating in the following tutorials at SC'11:

a. S07: 8:30am-5pm: TCC 202:
Hands-on Practical Hybrid Parallel Application Performance Engineering:

b. S04: 8:30am-5pm: TCC 101:
A New and Improved Eclipse Parallel Tools Platform: Advancing the Development of Scientific Applications:

c. Mon: 8:30am-5pm: TCC LL5:
Scalable Heterogeneous Computing on GPU Clusters

II. Demo station
We will be demonstrating TAU at the NNSA/ASC booth #803 at
SC'11. Please stop by our TAU demo station:
Monday: 7pm-9pm,
Tuesday: 10am-12pm, 2pm-4pm,
Wednesday: 10am-11am,
Thursday: 10am-11am, 2pm-3pm.

PGAS Booth #124 on the 4th floor of the convention center:
Wednesday: 2pm-4pm.

Microsoft Booth: TAU demo station.

Thursday, Nov 17, 12:15pm - 1:15pm WSCC 2A/2B
Score-P BOF: The Score-P Community Project -- An Interoperable
Infrastructure for HPC Performance Analysis Tools.

   We would like to thank our partners in the VI-HPS, and ParaTools, Inc. HPC Linux projects for their contributions to this iso image.

   Please let us know if we may assist you with our tools in any way.
   - Sameer
  (for tau-team@ cs.uoregon.edu)
