The POINT of Performance
Petascale Productivity from Open, Integrated Tools (POINT)
The National Science Foundation (NSF) has recently funded a project that will integrate, harden, and deploy an open, portable, robust performance tools framework for productive performance engineering of petascale applications on the NSF TeraGrid systems. The multi-institutional POINT project, funded by the NSF Software Development for Cyberinfrastructure (SDCI) program, partners the University of Oregon, the University of Tennessee, the National Center for Supercomputing Applications (NCSA), and the Pittsburgh Supercomputing Center (PSC).
According to Allen Malony of the University of Oregon, the project’s principal investigator:
“Now is the time to transfer successful, robust parallel performance infrastructure to an integrated, extensible, and sustainable performance tools suite that will be improved and supported for the long term to enable productive use of petascale HPC systems. In addition, if HPC resources are to be maximized, human-centric investments must also be made to help train application developers to be good performance engineers.”
The POINT project will improve and support a parallel performance environment that integrates the widely used TAU, PAPI, KOJAK, and PerfSuite technologies as core components. Each tool will be enhanced to better support user needs and evolving scalable HPC technology, and to interoperate as part of a performance engineering system to be used routinely in the performance evaluation and optimization of domain science and engineering (S&E) applications running on HPC systems of extreme scale.
This performance software foundation will be complemented by a community-driven education and training initiative to increase human productivity in performance engineering efforts across multiple S&E fields. The POINT training program for performance technology and engineering will be piloted and refined at the Pittsburgh Supercomputing Center and integrated with the TeraGrid Education, Outreach, and Training (EOT) mission over time. The objectives are to educate application developers and students in sound performance evaluation methods, to teach them best practices for engineering high-performance code solutions based on expert tuning strategies, and to train them to use the performance tools effectively.
The POINT project will demonstrate the performance tool suite and performance engineering practice through application engagements with the NAMD, NEMO3D, and ENZO projects. Collaboration with ENZO developers will address their need to fit large adaptive mesh refinement (AMR) problems efficiently on newer multicore architectures. Performance engineering work on the nanoHUB server at Purdue University will enable it to handle large-scale problems (e.g., with NEMO3D) and large numbers of users. Integration of POINT tools with Charm++ will enable applications supported by that system (e.g., NAMD) to achieve better performance results.
The NSF awarded $2.19 million for the POINT project, which has just completed the first quarter of its three-year grant. During this time, the team has outlined strategies for integration and interoperability between component tool sets and has begun planning pilot sessions for POINT training. The individual tools from the POINT project are currently available on TeraGrid systems; TAU and PAPI, for example, are deployed on the Ranger machine at the Texas Advanced Computing Center (TACC).