Research
Research Topics
Modular and Automated Monitoring and Analysis Infrastructures for HPC systems
Holistic Performance Engineering and Modeling
HPC System Benchmarking – Taxonomy, Comparability, Reproducibility
Parallel I/O & Storage Optimization for Emerging Workloads and I/O Subsystems (DL, AI, Big Data, Workflows etc.)
Distributed Container Monitoring in HPC Systems
Simulation and Analysis of Heterogeneous Network and Storage Infrastructures in HPC and MSA systems
Analysis of Correlation between Network Traffic Patterns and File Access Patterns
Research Projects
NHR South-West (NHR@SW)
The NHR (national high-performance computing) network consists of several centers that operate high-performance computers and offer coordinated consulting on the methods of scientific high-performance computing. The goal is to provide scientists at German universities with the computing capacity they need for their research and to strengthen their skills in using this resource efficiently.
NHR South-West (NHR@SW) is a collaboration of Goethe University Frankfurt, Johannes Gutenberg University Mainz, University of Kaiserslautern-Landau and Saarland University, embedded into the German NHR association. Combining the expertise of the partners in HPC and Artificial Intelligence, it offers researchers access to unique, state-of-the-art HPC services and systems. Our researchers work in close collaboration with domain scientists in the Simulation and Data Labs and develop new methods in the Method Labs.
Further information on how to apply for compute time via NHR-JARDS can be found here.
MAWA-HPC: Modular and Automated Workload Analysis for HPC
Given the complexity of modern supercomputers and HPC systems, achieving theoretical peak performance depends on myriad parameters. To optimize system performance and use the underlying resources efficiently, various methods can be applied, including simulation, benchmarking, and monitoring. However, these methods and their tools are not compatible with each other: each tool and each approach covers only part of a single domain, e.g., network, I/O, or resource allocation. At the same time, each approach generates knowledge that can be applied to similar problems or to a particular system configuration. So that this knowledge is not produced for one-time use only, and so that it can support other users, it must be easily accessible to the community. The MAWA-HPC project aims to develop a universal workflow and tool suite (from the execution environment through to the evaluation dashboard) that can be applied to different use cases in different domains. Through its modular design, the workflow should support different community tools at each stage, increasing the compatibility of each tool and covering new use cases.
People: Sarah Neuwirth (PI), Zhaobin Zhu (PhD Student), Niklas Bartelheimer (PhD Student)
Dissemination (selection):
ICPADS 2023 Research Paper
ZIH Colloquium Series 2023, Invited Talk
PASC 2023 Keynote Talk
ISC 2023 Research Poster
ESSA Workshop Paper @ IPDPS 2023
REX-IO Workshop Paper @ CLUSTER 2022
EUPEX Project: European Pilot for Exascale
The EuroHPC EUPEX consortium will design and build a European modular Exascale-ready pilot system integrating a European general-purpose processor technology (EPI), an interconnect technology (BXI) and a software stack for HPC based on a modular supercomputing architecture (MSA). The High Performance Computing and its Applications Group is contributing to the key technologies here. In particular, we will evaluate direct communication methods for distributed accelerator technologies and adapt them for the new pre-exascale evaluation platform EUPEX. We will also investigate I/O load balancing in parallel file systems for application and metadata and introduce suitable optimization approaches.
People: Sarah Neuwirth (Co-PI), Niklas Bartelheimer (PhD Student), Zhaobin Zhu (associated PhD Student)
Funding: EuroHPC JU, BMBF
Funding Period: 2022 to 2026
Dissemination (selection):
ACM TOS 2024 Journal Paper
IJNC 2024, Vol. 14, No. 1, Journal Paper
ICPADS 2023 Regular Research Paper
ISC 2023 Research Poster
ESSA Workshop Paper @ IPDPS 2023
APDCM Workshop Paper @ IPDPS 2023
Additional Information: https://eupex.eu/
Past Research Projects
NHR Project: Container Standards in HPC (2022-2023)
The NHR project focuses on implementing a central repository for curated user containers and containerized services that are portable among participating NHR centers and other HPC sites. These developments will also be used to provide HPC-as-a-Service to NHR users. In addition, the project will provide documentation and best practices for container runtimes and container management solutions, and will evaluate and implement security mechanisms and monitoring instrumentation for containers.
People: Sarah Neuwirth (Co-PI), Zhaobin Zhu (PhD Student), Mascha Magin (Bachelor Student)
Dissemination:
Bachelor Thesis, 2023