ToM on Cray CNL

Updated: 6/18/2010

Known issues:

1. MRNet does not build under the PGI environment on either kraken or jaguar. The MRNet team is aware of this issue. Michael Brim (University of Wisconsin) says the problem lies in the way the STL libraries are used, and the team is trying to address it.

2. ToM builds under both the PGI and GNU environments. This does not help, however, because PGI and GNU mangle C++ names differently and MRNet cannot be built with PGI (issue 1). Attempting to use ToM/PGI on an application with MRNet/GNU exposes this mismatch.

3. The above is a problem for applications like PFLOTRAN, which currently works under the PGI (and Cray) programming environments but which I have failed to get working under the GNU environment.

4. The ToM front-end needs to be started in the background, concurrently with the application back-ends (instrumented by TAU and managed by ToM). The Cray scheduling environment allows this, but requires (for good reason) that node boundaries be respected (node size of 12). So, if processes are put into the background on a set of nodes, those nodes cannot also host any new processes subsequently started by aprun.
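
The node-boundary constraint in item 4 affects how many nodes a job has to request. The following is a minimal sketch of that arithmetic, not part of ToM or TAU: it assumes that "size of 12" means 12 cores per node and that the front-end processes sit on nodes of their own, separate from the nodes aprun uses for the application. The names `nodes_for` and `total_nodes` are hypothetical.

```cpp
// Assumption: 12 cores per node, matching the node size quoted above.
constexpr int kCoresPerNode = 12;

// Whole nodes needed to host `procs` processes (rounds up).
constexpr int nodes_for(int procs, int cores_per_node = kCoresPerNode) {
    return (procs + cores_per_node - 1) / cores_per_node;
}

// Nodes to request when the front-end may not share nodes with
// the application ranks launched later by aprun.
constexpr int total_nodes(int app_ranks, int frontend_procs) {
    return nodes_for(app_ranks) + nodes_for(frontend_procs);
}

// 24 application ranks fill two nodes; one front-end process needs a third.
static_assert(total_nodes(24, 1) == 3, "front-end occupies its own node");
// 25 ranks spill onto a third application node.
static_assert(total_nodes(25, 1) == 4, "partial nodes still count as whole nodes");
```

Under these assumptions, a single background front-end process costs a full extra node regardless of how few cores it uses.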

Possible solutions/workarounds:

1. One possible solution is to make ToM use MRNet's lightweight interface. The lightweight interface is C-based and therefore not subject to C++ name mangling, so ToM/PGI should work with MRNet/GNU and PFLOTRAN/PGI. In any event, the lightweight interface is required for BG/P operations, so this should be a win-win.
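
To illustrate why a C-based interface sidesteps the mangling problem, here is a small sketch. It does not use MRNet's actual lightweight API; `tom_lw_demo_sum` and `demo::sum` are hypothetical names invented for this example. A function declared with `extern "C"` is exported under its unmangled C name, so object code from different compilers can refer to it by the same symbol, whereas the symbol for an ordinary C++ function is encoded in a compiler-specific way.

```cpp
// Hypothetical C-linkage entry point: exported under an unmangled C symbol
// name, so the symbol does not depend on a compiler's C++ mangling scheme.
extern "C" int tom_lw_demo_sum(const int* values, int count) {
    int total = 0;
    for (int i = 0; i < count; ++i) {
        total += values[i];
    }
    return total;
}

namespace demo {
// Ordinary C++ function: its symbol name encodes the namespace and
// parameter types, and that encoding can differ between compilers.
int sum(const int* values, int count) {
    return tom_lw_demo_sum(values, count);
}
}  // namespace demo
```

Only the boundary between the two toolchains needs C linkage; code on either side can remain C++ internally, as `demo::sum` does here.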