Fault tolerance is a cross-cutting concern that spans all layers of the hardware/software stack and therefore requires coupled improvements within each layer as well as co-design across layers. FTS aims to provide a venue for researchers to share experiences across the hardware/software layers and for attendees to gain a holistic view of fault-tolerance techniques, with a particular focus on HPC and parallel computing.
- Bogdan Nicolae, Argonne National Laboratory, US
- Guillaume Aupy, INRIA, France
Architects, administrators, and users of modern high-performance computing (HPC) systems strive to meet goals of energy, resource, and application efficiency. Optimizing for any or all of these can only be accomplished through analysis of appropriate system and application information. While application performance analysis tools can provide insight into application behavior, their overhead typically degrades performance while they are in use. Thus, they are generally employed for application performance tuning rather than for production runs. Likewise, traditional system monitoring tools, in conjunction with analysis tools, can provide insight into run-time system resource utilization. However, due to overhead and impact concerns, such tools are often run with collection periods on the order of minutes, or are used only to diagnose problems rather than during normal HPC system operation. There are currently few, if any, tools that provide continuous, low-impact, high-fidelity system monitoring, analysis, and feedback that meet the increasingly urgent resource efficiency optimization needs of HPC systems.
Modern processors and operating systems used in HPC systems expose a wealth of information about how system resources, including energy, are being utilized. Lightweight tools that gather and analyze this information could provide feedback, including at run-time, to increase application performance, optimize system resource utilization, and drive more efficient future HPC system design.
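As one small illustration of the kind of lightweight collection this implies, the sketch below samples an OS-exposed energy counter on Linux. The sysfs path assumes the intel_rapl powercap driver and a one-second sampling interval, both chosen purely for illustration; the information actually exposed varies by processor and operating system.

```c
#include <stdio.h>
#include <unistd.h>

/* Read a single integer counter from a file exported by the OS. */
static long long read_counter(const char *path)
{
    FILE *f = fopen(path, "r");
    long long v = -1;

    if (f) {
        if (fscanf(f, "%lld", &v) != 1)
            v = -1;
        fclose(f);
    }
    return v;
}

int main(void)
{
    /* Package-level energy counter in microjoules; this path assumes the
     * Linux intel_rapl powercap driver and may differ on other systems. */
    const char *rapl = "/sys/class/powercap/intel-rapl:0/energy_uj";

    long long e0 = read_counter(rapl);
    sleep(1);                              /* illustrative 1 s sampling interval */
    long long e1 = read_counter(rapl);

    if (e0 >= 0 && e1 >= 0 && e1 >= e0)    /* ignore counter wrap-around */
        printf("approximate package power: %.2f W\n", (e1 - e0) / 1e6);
    else
        printf("energy counter unavailable or wrapped\n");

    return 0;
}
```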
The goal of this workshop is to provide an opportunity for researchers to exchange new ideas, research, techniques, and tools in the area of HPC monitoring, analysis, and feedback as it relates to increasing efficiency with respect to energy, resource utilization, and application run-time.
- Ann Gentile, Sandia National Laboratories, US
- Jim Brandt, Sandia National Laboratories, US
- Benjamin Allan, Sandia National Laboratories, US
- Cory Lueninghoener, Los Alamos National Laboratory, US
- Nichamon Naksinehaboon, Open Grid Computing, US
- Boyana Norris, University of Oregon, US
- Narate Taerat, Open Grid Computing, US
Held as part of IEEE Cluster 2018, the Third International Workshop on Representative Applications is concerned with the development and use of representative applications (such as mini-applications or proxies) for all aspects of high-performance computing.
The challenges posed by future systems, including new architectures, programming models, and machine scales, mean that representative applications are an essential tool on the path towards exascale. These applications can be used for acceptance testing, benchmarking, and optimization evaluation, as well as for investigating the performance of new architectures or programming models. However, the development of such representative applications must ensure that they maintain some correspondence with their parent application. Maintaining that correspondence requires methods and tools for identifying the performance-critical aspects of a given code, or class of codes, and for verifying that the proxy accurately models the target behavior.
The main aim of WRAp is to provide a venue in which the wide range of disciplines involved in creating and using representative applications can share important discoveries and lessons learned from these applications.
- Tom Scogland, Lawrence Livermore National Laboratory, US
- David Beckingsale, Lawrence Livermore National Laboratory, US
The commoditization of high-performance computing to a broader range of applications, coupled with the diminishing performance gains from traditional scaling technologies, has led to broad interest in a number of new compute acceleration technologies, from GPGPUs to CGRAs and FPGAs. Meanwhile, SIMD widths have been widening to try to keep up with computational demand, and general-purpose architectures have started incorporating features from the vector architectures that once dominated high-performance computing. From IBM's Vector Multimedia eXtension (VMX) to NEC's SX architecture to Intel's Advanced Vector Extensions (AVX) to ARM's recently announced Scalable Vector Extension (SVE), all of the major general-purpose architectures appear to have embraced a return to vector-based functionality. To support these hardware developments, a number of features are being proposed for incorporation into modern programming models and languages, both to expose the vector additions and to restructure memory access so that the computational pipelines stay fed. Meanwhile, application developers have been hard at work refactoring code to take advantage of wider vector units and more complicated memory hierarchies. Tools and techniques for developing for these new vector architectures are still evolving, particularly in emerging languages and runtimes.
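As a minimal sketch of the programming-model features in question, the C kernel below uses the OpenMP `simd` directive to ask the compiler to map a loop onto the available vector units; the function name and compiler flags are illustrative examples, not drawn from any particular workshop contribution.

```c
#include <stddef.h>

/* y <- a*x + y, annotated so the compiler may vectorize the loop to the
 * widest vector unit the target provides (e.g. AVX or SVE). */
void daxpy(size_t n, double a, const double *restrict x, double *restrict y)
{
    #pragma omp simd
    for (size_t i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}
```

Compiled with, for example, `gcc -O3 -fopenmp-simd -march=native`, the loop can be lowered to whatever vector width the target architecture offers.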
- Luiz DeRose, Cray, US
- Eric Van Hansbergen, ARM, US