SUMMARY OF THE TALKS

09:00 – 10:45

Session 1: Welcome and Introduction to the Projects and NCCs

09:00 – 09:10

Welcome – Tomas Karasek (NCC CZ)

09:10 – 09:25

Introduction of the EuroHPC JU and EuroCC project – Tomas Karasek (NCC CZ)

09:25 – 09:45

Introduction of the NCC Czech Republic – Tomas Karasek (NCC CZ)

09:45 – 10:05

Introduction of the NCC Germany – Sohel Herff (NCC DE)

10:05 – 10:25

Introduction of the NCC IS – Simulation & Data Labs of NCC Iceland – An Invitation for International Cooperation – Morris Riedel (NCC IS)

10:25 – 10:45

Introduction of the CoE RAISE – Andreas Lintermann (FZJ - RAISE)

11:20 – 13:00

Session 2: AI- and HPC-Cross-Methods at Exascale

11:20 – 11:40

Porting, optimisation and performance analysis of RAISE software stack – Guillaume Houzeaux (BSC – RAISE)

The architectures of next-generation supercomputers with Exascale power will evolve around the current modular and heterogeneous setups. Such a modular approach is especially suited for compute- and data-centric workflows that may require different High-Performance Computing (HPC) architectures for the various, potentially concurrently running, workflow components. During the course of RAISE, the HPC codes and ML tools are continuously being ported, optimised, and analysed to achieve the highest possible performance. The systems tested include those of the partner centres as well as systems accessed through PRACE access calls. We will present some of these optimisations and performance analyses.

11:40 – 12:00

From prototypes to exascale systems: testing and optimising existing ML frameworks – Eray Inanc (FZJ – RAISE)

Within RAISE, the developers have access to various hardware prototypes, which represent ideal playgrounds for testing purposes. Several highly scalable ML frameworks, such as PyTorch Distributed Data Parallel, Horovod, HeAT, or DeepSpeed, are ported and optimised on these prototypes. As training on enormous datasets requires such frameworks, the findings here have the potential to influence the co-design of next-generation exascale systems by providing feedback to the maintainers and operators.
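
As a minimal illustration of what running one of these frameworks involves, the sketch below sets up PyTorch Distributed Data Parallel training with a placeholder model and synthetic data (the launcher, e.g. torchrun or srun, is assumed to provide the usual rank environment variables):

    import os
    import torch
    import torch.distributed as dist
    from torch.nn.parallel import DistributedDataParallel as DDP
    from torch.utils.data import DataLoader, TensorDataset, DistributedSampler

    def main():
        # Rank/world size are typically injected by the launcher (e.g. torchrun).
        dist.init_process_group(backend="nccl")
        local_rank = int(os.environ["LOCAL_RANK"])
        torch.cuda.set_device(local_rank)

        # Placeholder model and synthetic data; real workloads load HPC datasets.
        model = torch.nn.Linear(32, 1).cuda(local_rank)
        model = DDP(model, device_ids=[local_rank])
        data = TensorDataset(torch.randn(1024, 32), torch.randn(1024, 1))
        sampler = DistributedSampler(data)  # shards the data across ranks
        loader = DataLoader(data, batch_size=64, sampler=sampler)

        opt = torch.optim.SGD(model.parameters(), lr=1e-3)
        loss_fn = torch.nn.MSELoss()
        for epoch in range(2):
            sampler.set_epoch(epoch)  # reshuffle the shards each epoch
            for x, y in loader:
                x, y = x.cuda(local_rank), y.cuda(local_rank)
                opt.zero_grad()
                loss_fn(model(x), y).backward()  # gradients all-reduced by DDP
                opt.step()
        dist.destroy_process_group()

    if __name__ == "__main__":
        main()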

12:00 – 12:20

Hybrid Quantum-Classical Methods for Machine Learning – Marcel Aach (FZJ – RAISE)

Quantum Annealers (QA) that have become available in the last few years are well suited to solving certain optimisation problems much faster than a regular computer. However, the size of the problems that can actually be computed on a QA is still limited. Therefore, hybrid quantum-classical machine learning methods are necessary to run larger problems on the QA. In the RAISE project, such a hybrid quantum-classical method is explored for finding the optimal hyperparameters of neural networks.
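
As a schematic illustration of the idea (not the project's actual formulation), a small hyperparameter choice can be phrased as a binary quadratic model; the classical exact solver below stands in for the QA backend:

    import dimod

    # Hypothetical one-hot choice: pick exactly one learning rate out of three.
    # Linear biases are placeholder validation losses from earlier classical runs.
    losses = {"lr_1e-2": 0.82, "lr_1e-3": 0.31, "lr_1e-4": 0.55}
    penalty = 2.0  # strength of the one-hot constraint

    bqm = dimod.BinaryQuadraticModel("BINARY")
    for var, loss in losses.items():
        # (sum x_i - 1)^2 contributes -penalty to each linear term ...
        bqm.add_variable(var, loss - penalty)
    variables = list(losses)
    for i in range(len(variables)):
        for j in range(i + 1, len(variables)):
            # ... and +2*penalty to each pairwise term.
            bqm.add_interaction(variables[i], variables[j], 2.0 * penalty)

    # ExactSolver is a classical stand-in; a QA sampler would be used at scale.
    best = dimod.ExactSolver().sample(bqm).first.sample
    print({v for v, x in best.items() if x == 1})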

12:20 – 12:40

Towards adoption of a Unique AI Framework for the EuroHPC JU Systems Ecosystem – Morris Riedel (UOI – RAISE)

CoE RAISE develops a Unique AI Framework for HPC systems co-designed by various scientific and engineering applications that take advantage of AI at scale. The talk will present the co-design activities and provide an overview of the initial framework blueprint. Requirements and framework component details will be provided, including a path towards broader adoption of the framework beyond CoE RAISE and its partners.

12:40 – 13:00

Lessons learned from applying cross-sectional AI/HPC Methods in Scientific & Engineering Applications – Morris Riedel (UOI – RAISE)

CoE RAISE identifies, develops, and adopts various cutting-edge AI techniques in compute- and data-driven applications. The talk will present an overview of these techniques and what concrete applications are using specific AI/HPC methods at scale. Lessons learned on adopting these techniques and a path towards broader adoption of the AI methods beyond CoE RAISE applications will be provided.

14:00 – 16:00

Parallel Session 3.1: Compute-Driven Use Cases (CFD and AI)

14:00 – 14:20

Surrogate-based optimisation for turbulent boundary layer flows – Fabian Hübenthal (RWTH – RAISE)

CFD at the HPC scale is costly, and the design space is too vast to be sufficiently analysed by grid search methods. In this talk, surrogate-based strategies like support vector regression-based ridgeline and Bayesian optimisation are explored to efficiently guide the design decision process. Wall-resolved large-eddy simulations of the active drag reduction of turbulent boundary layer flow using spanwise travelling transversal surface waves provide the data, which are augmented by prior knowledge.
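
As a minimal illustration of the surrogate idea (the parameters, objective values, and scikit-learn model below are illustrative assumptions, not data from the actual LES campaign):

    import numpy as np
    from sklearn.svm import SVR

    # Illustrative: (amplitude, wavelength) samples with a drag objective from
    # a few expensive LES runs; in practice these come from the simulations.
    X = np.array([[0.10, 1.0], [0.20, 1.5], [0.30, 2.0],
                  [0.15, 0.8], [0.25, 1.2]])
    y = np.array([0.95, 0.88, 0.91, 0.97, 0.86])  # placeholder drag ratios

    surrogate = SVR(kernel="rbf", C=10.0).fit(X, y)

    # Query the cheap surrogate densely instead of running more LES cases.
    grid = np.array([[a, w] for a in np.linspace(0.10, 0.30, 20)
                             for w in np.linspace(0.8, 2.0, 20)])
    pred = surrogate.predict(grid)
    print("candidate optimum:", grid[np.argmin(pred)])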

14:20 – 14:40

Geometrical surrogates in CFD – Cristóbal Samaniego (BSC – RAISE)

Complex flow problems involve relatively small rigid-body obstacles embedded in large computational domains, for example wind turbines in wind farms. This combination of disproportionate geometrical scales leads to very large meshes and small time steps. The implementation we propose substitutes the complex high-resolution geometries of the obstacles with lower-resolution zones of variable anisotropic porosity, which enters the Navier-Stokes equations. The porosity mimics the presence of the obstacles by slowing down the fluid and provoking an associated pressure gradient.
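
A common form of such a penalisation, given here for illustration (the exact closure used in the implementation may differ), adds a Darcy-like drag term to the momentum equation:

    \frac{\partial \mathbf{u}}{\partial t} + (\mathbf{u}\cdot\nabla)\mathbf{u}
      = -\frac{1}{\rho}\nabla p + \nu\,\nabla^{2}\mathbf{u}
        - \nu\,\mathbf{K}^{-1}(\mathbf{x})\,\mathbf{u},

where \mathbf{K}(\mathbf{x}) is a spatially variable (and, for anisotropic porosity, tensor-valued) permeability: small permeability inside an obstacle zone strongly damps the velocity and provokes the associated pressure gradient, while large permeability recovers the unobstructed equations.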

14:40 – 15:00

AVBP-DL: a prototype for introducing AI into large physical solvers – Corentin Lapeyre (CERFACS – RAISE)

Large physics solvers incorporate many person-years of knowledge to achieve accurate simulation of complex phenomena on massively parallel machines. Hybrid strategies suggest that incorporating deep learning into such solvers could alleviate bottlenecks, increase internal model accuracies, or decrease the time to result. However, these strategies rely on modern programming frameworks and tools that were not designed for interoperability with physics solvers. We present here a set of strategies to effectively set up this coupling, depending on the constraints of the physics problem and the neural network architecture, on hybrid hardware composed of CPUs and GPUs.
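
One generic pattern for such a coupling, sketched below under the assumption of a Python-side inference component (AVBP-DL's actual coupling mechanism is not reproduced here), is to expose the trained network behind a simple inference function that the solver calls with field arrays at each time step:

    import numpy as np
    import torch

    # Placeholder network standing in for a trained sub-model (e.g. a closure).
    model = torch.nn.Sequential(torch.nn.Linear(3, 64), torch.nn.ReLU(),
                                torch.nn.Linear(64, 1))
    model.eval()
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model.to(device)

    def infer(fields: np.ndarray) -> np.ndarray:
        """Called by the solver each step with an (n_cells, 3) array of
        local flow quantities; returns one modelled quantity per cell."""
        with torch.no_grad():
            x = torch.from_numpy(fields).float().to(device)
            return model(x).cpu().numpy()

    # Stand-in for a solver call site: 10k cells, 3 input quantities each.
    out = infer(np.random.rand(10_000, 3).astype(np.float32))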

15:00 – 15:20

Innovative hydrogen-burning zero-emission aircraft engine design using data-driven models – Corentin Lapeyre (CERFACS – RAISE)

Hydrogen combustion is gaining interest in the decarbonisation of the aviation sector. From a technical point of view, this fuel brings new engine development opportunities since it allows lean burn modes, thus favouring ultra-low NOx emissions while providing zero-carbon emissions in flight. However, hydrogen flames are very fast and thin, which calls for the development of breakthrough combustion concepts and associated CFD tools. This talk highlights the specificities of hydrogen combustion physics and modelling and presents numerical approaches based on pure CFD or CFD-AI coupling for the simulation of H2 flames. First simulations of aviation hydrogen combustion technology are used to illustrate the role of CFD and AI in the development of the future generation of aircraft engines.

15:20 – 15:40

AI modelling of wetting hydrodynamics – Andreas Demou (CYI – RAISE)

In this study, we provide an assessment of the Fourier Neural Operator (FNO) in capturing droplet transport on chemically heterogeneous surfaces. Two distinct data-driven approaches are followed: (i) the FNO is applied as an iterative architecture and (ii) a low-order approximation is corrected with an FNO-based counterpart. The performance of each approach is evaluated, and the findings of the study can be used in guiding the deployment of the FNO on high-fidelity data.
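
At the core of the FNO is the spectral convolution: transform to Fourier space, apply learned complex weights to the lowest modes, and transform back. A minimal 1D PyTorch sketch (channel counts, mode counts, and grid sizes are illustrative):

    import torch

    class SpectralConv1d(torch.nn.Module):
        """Minimal 1D spectral convolution as used inside an FNO layer."""
        def __init__(self, channels: int, modes: int):
            super().__init__()
            self.modes = modes  # number of low Fourier modes kept
            scale = 1.0 / channels
            self.weights = torch.nn.Parameter(
                scale * torch.randn(channels, channels, modes,
                                    dtype=torch.cfloat))

        def forward(self, x):                  # x: (batch, channels, grid)
            x_ft = torch.fft.rfft(x)           # to Fourier space
            out_ft = torch.zeros_like(x_ft)
            # mix channels on the retained low-frequency modes only
            out_ft[:, :, :self.modes] = torch.einsum(
                "bim,iom->bom", x_ft[:, :, :self.modes], self.weights)
            return torch.fft.irfft(out_ft, n=x.size(-1))  # back to physical

    # Iterative rollout (approach (i) above): feed predictions back in.
    layer = SpectralConv1d(channels=4, modes=8)
    state = torch.randn(2, 4, 64)
    for _ in range(5):
        state = layer(state)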

15:40 – 16:00

When AI meets CFD – Michal Kravcenko (NCC CZ)

The aim of the joint project between the IT4Innovations National Supercomputing Center and Orgrez is to develop a software tool for the initial design of equipment for the catalytic reduction of nitrogen oxides. The tool will be built on a neural network-based surrogate model. Methods of Computational Fluid Dynamics (CFD) are used to prepare the datasets for training, validation, and testing of the neural networks. In this talk, the problem will be described together with the methods and tools used to build the surrogate model.
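
A minimal sketch of the dataset-splitting and surrogate-training step, assuming tabular CFD samples that map design parameters to a conversion metric (all names, shapes, and the scikit-learn model choice are illustrative assumptions):

    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPRegressor

    # Placeholder CFD dataset: design parameters -> reduction efficiency.
    X = np.random.rand(500, 6)   # e.g. geometry and flow parameters
    y = np.random.rand(500)      # e.g. NOx conversion from CFD runs

    # Split into training, validation, and test sets as described in the talk.
    X_tmp, X_test, y_tmp, y_test = train_test_split(X, y, test_size=0.2)
    X_train, X_val, y_train, y_val = train_test_split(X_tmp, y_tmp,
                                                      test_size=0.25)

    surrogate = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000)
    surrogate.fit(X_train, y_train)
    print("validation R^2:", surrogate.score(X_val, y_val))
    print("test R^2:", surrogate.score(X_test, y_test))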

14:00 – 16:00

Parallel Session 3.2: Data-Driven Use Cases

14:00 – 14:20

Event reconstruction and classification at the CERN HL-LHC – Eric Wulff (CERN – RAISE)

Hyperparameter optimisation (HPO) of deep learning-based AI models is often compute-resource intensive and calls for the use of large-scale distributed resources as well as scalable and resource-efficient HPO algorithms. We leverage the benefits of HPC and quantum computing for HPO by exploring new techniques. We also investigate the use of support vector regression (SVR) models and quantum SVR (QSVR) models for model performance prediction, with the potential application of improving hyperparameter optimisation.

14:20 – 15:00

Seismic imaging with remote sensing for energy applications – Gabriele Cavallaro (FZJ – RAISE) and Naveed Akram (CYI – RAISE)

Seismic imaging is the most powerful technology available for uncovering the Earth's subsurface structures, whereas remote sensing is an indispensable tool for observing and monitoring the Earth's surface. Under CoE RAISE, we are pursuing the advancement of machine learning (ML) approaches across both fronts, aiming, towards the end of the project, to integrate both technologies in a synergistic framework where the outputs of remote sensing inform and guide those of seismic imaging.

 

Seismic imaging is used with a high level of confidence for the discovery of new energy resources, including geothermal reservoirs. Such imaging workflows, however, often rely on computationally expensive simulations, and it is desirable to explore the potential of ML methodologies to at least partly replace some of the computationally demanding components. In this work, we have explored a number of candidate generative neural network architectures for seismic wave modelling, with the aim of integrating them into an inversion process during which the subsurface features are uncovered from measurements recorded at the Earth's surface.

15:00 – 15:40

Predicting porosity formation during Selective Laser Melting: towards defect-free 3D-printing of stainless steel

Additive manufacturing (AM) of metal products is achieved by selective laser melting (SLM) at the surface of a metal powder bed. To fulfil the need for quality assurance without tedious (and often destructive) post-process quality control, we develop predictive models that uncover defects based on process-monitoring data. The raw data stream is dominated in volume by high-speed video at 1 TB per hour. Model exploration, training, and tuning are accelerated on HPC.

 

ML-Ops on HPC: Deploying ClearML Server on a modular supercomputer – Kurt De Grave (FM – RAISE)

ML-Ops is the set of best practices to develop, deploy, and maintain machine learning models.  The practices need to be supported by ML-Ops software tools.  To provide comprehensive ML-Ops functionality in a developer-friendly way without sacrificing transparency or the option to investigate potential performance issues, we deploy ClearML on HPC.  The various server components run in a Kubernetes cluster in the OpenStack-based private cloud at the Vlaams Supercomputer Center.  The client component runs on HPC nodes.  All traffic is encrypted, so the service can track experiments on different HPC clusters.
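
On the client side, tracking an experiment against such a self-hosted server takes only a few lines; the project and task names below are placeholders, and the server endpoints are assumed to be configured in the user's clearml.conf:

    from clearml import Task

    # Connects to the self-hosted ClearML server configured in ~/clearml.conf;
    # works the same from a laptop or from an HPC compute node.
    task = Task.init(project_name="raise-demo", task_name="training-run")

    params = {"lr": 1e-3, "batch_size": 64}
    task.connect(params)  # hyperparameters become editable in the web UI

    logger = task.get_logger()
    for step in range(100):
        loss = 1.0 / (step + 1)  # placeholder training loop
        logger.report_scalar(title="loss", series="train",
                             value=loss, iteration=step)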

15:40 – 16:00

Towards personalized hearing in virtual environments using AI techniques – Morris Riedel (UOI – RAISE)

The talk describes the application domain of binaural audio technologies that rely on head-related transfer functions (HRTFs), specific digital filters that capture the acoustic effects of the human head. The CoE RAISE application uses HPC/AI methods to overcome the challenge of using non-individual HRTFs that never match the listener's unique anthropometry, resulting in frequent localisation errors such as front/back reversals, elevation-angle misperception, and inside-the-head localisation.

16:20 – 17:20

Parallel session 4.1: NCC-Specific Activities 1

16:20 – 16:40

Hybrid AI/HPC workflow for the optimisation of material data to model the forming process during manufacturing – Li Zhong (NCC DE – HLRS)

For the provision of stable production processes for future sheet-metal components, finite element method (FEM) simulation has recently become increasingly popular due to its versatility in encoding various material behaviours. We propose a novel approach that combines machine learning methods with FE simulation, in which a deep neural network (DNN) model is developed and trained to determine the simulation parameters automatically, reducing the demand for expertise, resources, and time in material parameter determination.

16:40 - 17:00

Increasing Usability by Using Interactive HPC – Jens Henrik Göbbert (NCC DE – JSC)

Interactive exploration and analysis of large amounts of data from scientific simulations, in-situ visualisation, and application control are compelling scenarios for explorative sciences. JupyterLab makes it possible to combine interactive with reproducible computing while supporting a wide range of different software workflows. However, a number of challenges must be mastered in order to make existing HPC environments ready for intuitive interactive high-performance computing.
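
One common building block for this, shown below as an illustrative sketch (not JSC's actual deployment), is JupyterHub's batchspawner, which launches each user's JupyterLab session as a batch job on the HPC system:

    # jupyterhub_config.py (illustrative excerpt)
    import batchspawner  # noqa: F401  (registers the spawner classes)

    c.JupyterHub.spawner_class = "batchspawner.SlurmSpawner"

    # Resources requested for each user's JupyterLab session.
    c.SlurmSpawner.req_partition = "interactive"   # placeholder partition name
    c.SlurmSpawner.req_runtime = "02:00:00"
    c.SlurmSpawner.req_memory = "16G"

    # Start JupyterLab rather than the classic notebook interface.
    c.Spawner.default_url = "/lab"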

17:00 – 17:20

Analysis of log data of an HPC system – Sameed Hayat (NCC DE – HLRS)

The High-Performance Computing Center Stuttgart (HLRS) provides HPC-based infrastructures for data analytics, artificial intelligence, and machine learning. These platforms collect a large number of metrics every minute into an Elasticsearch database. There is a wide variety of causes for failures at various levels of HPC systems: bit errors in memory, memory failures, core failures, node failures, and interconnect communication failures. We apply machine learning and deep learning techniques to predict possible performance degradation of the HPC system in advance.
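
As a schematic example of such a pipeline (the endpoint, index, field names, and model choice are illustrative assumptions), per-node metrics can be pulled from Elasticsearch and screened with an unsupervised anomaly detector:

    import numpy as np
    from elasticsearch import Elasticsearch
    from sklearn.ensemble import IsolationForest

    es = Elasticsearch("https://metrics.example.org:9200")  # placeholder

    # Pull recent per-node metrics; index and field names are hypothetical.
    resp = es.search(index="node-metrics", size=10_000,
                     query={"range": {"@timestamp": {"gte": "now-1h"}}})
    rows = [[h["_source"]["cpu_load"], h["_source"]["mem_used"],
             h["_source"]["ib_errors"]] for h in resp["hits"]["hits"]]

    # Flag unusual minutes that may precede failures or degradation.
    detector = IsolationForest(contamination=0.01).fit(np.array(rows))
    flags = detector.predict(np.array(rows))  # -1 marks anomalous samples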

16:20 – 17:20

Parallel session 4.2: NCC-Specific Activities 2

16:20 – 16:40

AI-activities and infrastructures at HLRS – Oleksandr Shcherbakov (NCC DE – HLRS)

The High-Performance Computing Center Stuttgart (HLRS) has been driving the uptake of big data analytics, artificial intelligence, and machine learning on HPC-based infrastructures for more than six years. This talk will present insights from the CATALYST project, which evaluates both the software and hardware requirements of bringing AI workflows onto HPC, eventually yielding hybrid HPC/AI workflows. The talk concludes with an overview of the existing AI infrastructure at HLRS.

16:40 - 17:00

Exploring Next Generation AI Systems at LRZ – Nicolay J. Hammer (NCC DE – LRZ)

New generations and architectures of specialised AI hardware have been emerging over the last few years. LRZ has recently procured a system based on Cerebras’ 2nd-generation wafer-scale engine technology. This highly integrated HPAI system, based on an extremely large number of cores optimised for sparse linear algebra, promises outstanding training performance for neural network designs with limited scalability. We plan to evaluate this technology together with selected users with regard to user experience and application performance, and to explore different operating models.