Our training sessions at a glance

A lot has happened on our YouTube channel since last month, so we want to give you an overview of the past training sessions.

HPC Systems Engineering in the Interaction Room - 08.04.2021

This is still a work in progress, so please be on the lookout for some fresh results in the coming months. We are excited to share them with you soon.

This seminar introduces the general Interaction Room technique, which facilitates interdisciplinary collaboration in complex software projects, with an emphasis on projects that intertwine AI and HPC. An Interaction Room is a (physical or virtual) room that is outfitted with several large analogue or digital whiteboards known as canvases. They are used to visualize and facilitate discussion of critical aspects of a complex software system. Each canvas is dedicated to modeling a particular perspective on the system. The key difference from other modelling techniques is that models in the Interaction Room are kept deliberately informal. Hence, the goal is not to create a perfect specification but to encourage stakeholders from diverse backgrounds to discuss those aspects that are essential to the software project’s success.

The seminar demonstrates how the CoE RAISE aims to perform co-design with this Interaction Room technique to understand the domain requirements, understand technical restrictions, identify aspects of particular scientific value, and identify the most critical risks of those projects. The seminar further outlines initial lessons learned in intertwined HPC and AI applications. Using the Interaction Room at an early project stage helps prevent costly misunderstandings and oversights later on and has already proven helpful in numerous complex information systems projects.

Parallel & Scalable Machine & Deep Learning driven by High Performance Computing (HPC) - 12.05.2021

Many of the significant challenges that society faces, whether it is preserving our environment, improving our healthcare, or rebuilding our economy, are underpinned in some way or another by High-Performance Computing (HPC). This lecture will briefly review how innovative Artificial Intelligence (AI) techniques such as deep learning or quantum machine learning leverage HPC to create economic and societal benefits. As HPC has become an indispensable asset in the global data economy, crucial for addressing societal challenges and increasing industry competitiveness, the lecture presents Icelandic Research & Development activities that shape Europe’s digital future via EuroHPC and other related activities.

Interactive HPC with JupyterLab - 26 & 27.05.2021

CoE Training Course - "Interactive HPC with JupyterLab" - Part 1

Interactive exploration and analysis of large amounts of data from scientific simulations, in-situ visualization, and application control are convincing scenarios for explorative sciences. The open source software JupyterLab has for some time offered a way to combine interactive with reproducible computing while supporting a wide range of different workflows. The approach enables the creation of documents that combine live code with narrative text, mathematical equations, visualizations, interactive controls, and other rich output.

However, a number of challenges must be mastered in order to make existing workflows ready for interactive high-performance computing. With so many possibilities, it's easy to lose sight of the big picture. The course offers an introduction to the world of possibilities of JupyterLab.

Git-based Data Management with the Open-source DataLad Tool - 28.05.2021

The recording demonstrates how CoE RAISE and other computational-intensive and data-intensive communities can benefit from the free DataLad tool. It enables researchers to discover data since it has built-in support for metadata extraction and search. HPC & AI researchers often consume data in different ways that require direct access to individual files, especially when using only a few files from large datasets for analysis. DataLad enables that and also supports sharing datasets with the public or just some colleagues on platforms without the need for a central service for publishing datasets. Version control systems such as Git are a de-facto standard for open-source software development. DataLad brings a similar level of tooling to data management and analysis. HPC & AI researchers benefit from comprehensively tracking the exact state of any analysis inputs that produced results across the entire lifetime of a project and multiple datasets, enabling reproducibility.

Distributed Deep Learning - 29.07.2021

To leverage the power of high-performance computing (HPC), use case developers of CoE RAISE need to adopt techniques of distributed deep learning. The difference between distributed deep learning and traditional deep learning is that CoE RAISE developers and others in the community leverage many GPUs instead of a single workstation client equipped with one GPU. The seminar will outline why that approach is necessary for CoE RAISE to scale towards Exascale and present standard techniques to perform distributed deep learning at scale today. In addition, the seminar will highlight specific challenges of distributed deep learning, such as the choice of batch size. Finally, the seminar introduces distributed deep learning and its adoption in CoE RAISE use cases. It also discusses a possible adoption of Horovod as a concrete AI tool in the RAISE unique AI framework design. Among many possible solutions for distributed deep learning on HPC, Horovod has shown excellent scaling across a high number of nodes, making it a strong candidate for the CoE RAISE unique AI framework design.
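
The batch size challenge mentioned above can be made concrete with a small sketch. It assumes the common data-parallel scheme in which every worker computes a gradient on its own shard of the batch and the gradients are then averaged; the toy model, the dataset, and all function names are made up for illustration and do not use the Horovod API.

```python
# Toy illustration of data-parallel training: each "worker" computes a
# gradient on its own shard, then the gradients are averaged (the role an
# allreduce plays in distributed training tools). Pure Python, no GPUs.

def gradient(w, shard):
    """Gradient of the mean squared error for the model y = w * x on one shard."""
    return sum(2.0 * (w * x - y) * x for x, y in shard) / len(shard)

def split(data, num_workers):
    """Split data into equally sized shards, one per worker."""
    size = len(data) // num_workers
    return [data[i * size:(i + 1) * size] for i in range(num_workers)]

# Hypothetical dataset following y = 3x (8 samples).
data = [(float(x), 3.0 * x) for x in range(1, 9)]
num_workers = 4
shards = split(data, num_workers)

w = 0.0
local_grads = [gradient(w, shard) for shard in shards]
averaged_grad = sum(local_grads) / num_workers   # "allreduce" average
full_batch_grad = gradient(w, data)

# With equal shard sizes the averaged gradient equals the full-batch one,
# so the effective batch size is per-worker batch size * number of workers.
per_worker_batch = len(shards[0])
effective_batch = per_worker_batch * num_workers
```

Because the effective batch size grows with the number of workers, adding GPUs changes the optimisation problem itself, which is one reason the batch size needs attention when scaling out.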

Autoencoders - 31.08.2021

The CoE RAISE adopts a wide variety of AI models and algorithms in nine application use cases that co-design a unique AI framework for Exascale. Among those AI models, AutoEncoder (AE) models are getting increasingly popular, achieving innovative results in various CoE RAISE use cases. The seminar will provide a thorough introduction to the general approach to using AEs and why they are relevant in the context of the CoE RAISE. One of the key ideas of using AEs is to learn a representation (i.e., encoding) for datasets, typically for dimensionality reduction, by training the network to ignore insignificant data (i.e., noise). In addition, the seminar will present and discuss some recent developments in using AEs and their model variants, such as Variational AEs (VAEs), within CoE RAISE use cases.
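
As a minimal sketch of the encoding idea, the following pure-Python example trains a linear autoencoder that compresses 2-D points to a single number and reconstructs them. The data, initial weights, learning rate, and function names are assumptions chosen for illustration; real AEs use deep, nonlinear networks and a deep learning framework.

```python
# A tiny linear autoencoder: 2-D inputs are encoded to a 1-D code and
# decoded back. All data and hyperparameters here are made up.

def reconstruct(x, enc, dec):
    z = enc[0] * x[0] + enc[1] * x[1]       # encoder: 2-D -> 1-D code
    return z, (dec[0] * z, dec[1] * z)      # decoder: 1-D -> 2-D

def mean_loss(data, enc, dec):
    """Mean squared reconstruction error over the dataset."""
    total = 0.0
    for x in data:
        _, x_hat = reconstruct(x, enc, dec)
        total += (x[0] - x_hat[0]) ** 2 + (x[1] - x_hat[1]) ** 2
    return total / len(data)

def train(data, enc, dec, lr=0.01, steps=2000):
    """Plain full-batch gradient descent on the reconstruction error."""
    enc, dec = list(enc), list(dec)
    n = len(data)
    for _ in range(steps):
        g_enc, g_dec = [0.0, 0.0], [0.0, 0.0]
        for x in data:
            z, x_hat = reconstruct(x, enc, dec)
            err = (x_hat[0] - x[0], x_hat[1] - x[1])
            back = err[0] * dec[0] + err[1] * dec[1]   # error pushed back to the code z
            for i in range(2):
                g_dec[i] += 2.0 * err[i] * z / n
                g_enc[i] += 2.0 * back * x[i] / n
        for i in range(2):
            enc[i] -= lr * g_enc[i]
            dec[i] -= lr * g_dec[i]
    return enc, dec

# Points on the line y = 2x: two coordinates, but only one degree of freedom.
data = [(x, 2.0 * x) for x in (-1.0, -0.5, 0.0, 0.5, 1.0)]
enc0, dec0 = [0.5, 0.5], [0.5, 0.5]
loss_before = mean_loss(data, enc0, dec0)
enc, dec = train(data, enc0, dec0)
loss_after = mean_loss(data, enc, dec)
```

The data has only one degree of freedom, so a 1-D code is enough to describe it, and the reconstruction error drops during training.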

MLOps with ClearML - 30.09.2021

Working with AI approaches such as machine or deep learning models and algorithms is typically a chaotic process in each CoE RAISE use case. That means AI researchers suddenly drop specific models or add new models with a new set of parameters. Performing AI modelling is a very lively process whereby CoE RAISE aims to align as closely as possible with the industry standard of the Cross-Industry Standard Process for Data Mining (CRISP-DM). The seminar will introduce the idea of MLOps to cure this chaos by using tools for managing automation, orchestration, and reproducibility. Researchers in CoE RAISE adopt MLOps via the ClearML toolset, and thus the seminar will provide lessons learned and insights on using that tool in a real AI use case.

Hyperparameter Tuning with Ray Tune - 29.10.2021

The CoE RAISE project co-designs and uses a unique AI framework to develop novel AI techniques in terms of deep learning and machine learning models as part of scientific and engineering applications. This seminar introduces one component of the unique AI framework that enables automated hyperparameter tuning of those deep learning and machine learning models. By leveraging high-performance computing (HPC), this component provides the capability to develop models with excellent performance and accuracy. In addition, the seminar will introduce the field of hyperparameter optimisation and provide insights into a concrete AI tool called Ray Tune used in CoE RAISE across a wide variety of scientific and engineering use cases. Finally, the seminar introduces the broader optimisation approach called Neural Architecture Search (NAS) and provides selected application examples.
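
To illustrate what automated hyperparameter tuning does at its core, here is a minimal grid search in pure Python. It is a sketch only: `validation_loss` is a hypothetical stand-in for training and evaluating a model, the search space is made up, and none of this is the Ray Tune API. Tools such as Ray Tune run such trials in parallel on HPC resources and offer more advanced search strategies.

```python
import itertools

def validation_loss(learning_rate, batch_size):
    """Hypothetical stand-in for training a model and returning its validation loss."""
    return (learning_rate - 0.01) ** 2 * 1e4 + (batch_size - 64) ** 2 / 1e4

# Made-up search space: every combination is one independent trial.
search_space = {
    "learning_rate": [0.001, 0.01, 0.1],
    "batch_size": [32, 64, 128],
}

def grid_search(objective, space):
    """Evaluate every configuration and keep the one with the lowest loss."""
    names = list(space)
    best_config, best_loss = None, float("inf")
    for values in itertools.product(*(space[name] for name in names)):
        config = dict(zip(names, values))
        loss = objective(**config)
        if loss < best_loss:
            best_config, best_loss = config, loss
    return best_config, best_loss

best_config, best_loss = grid_search(validation_loss, search_space)
```

Since the trials do not depend on each other, they can run concurrently, which is where HPC resources pay off.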

Accelerating Machine Learning with GraphCore - 23.11.2021

The CoE RAISE project co-designs and uses a unique AI framework to develop novel AI techniques that benefit from high-performance computing (HPC) in various scientific and engineering use cases. Complex HPC systems typically leverage a high number of CPUs and GPUs combined with an extraordinarily good interconnect across the system. This seminar introduces the Intelligence Processing Units (IPUs) of a GraphCore system, a new disruptive technology entering the HPC market. The IPU is an entirely new processor designed for AI applications such as those within CoE RAISE. The seminar will thus provide information on how AI researchers can leverage the IPU’s unique architecture to create better models faster than ever. Finally, the seminar compares the new IPU approach with existing CPU and GPU approaches. It offers insights on using IPUs with AI tools such as TensorFlow, which is also part of the CoE RAISE framework design.

Graph Neural Networks - 31.03.2022

The application use cases of the CoE RAISE project co-design and use a unique AI framework to develop novel AI models and techniques that benefit from high-performance computing (HPC). Most use cases develop models based on innovative deep neural networks, such as Convolutional Neural Networks (CNNs), Long Short-Term Memory (LSTM), or AutoEncoders. This seminar introduces the idea of Graph Neural Networks (GNNs) and their benefits in identifying patterns in graphs with complex relationships and interdependencies between objects. GNNs are helpful because much application data has an underlying graph structure and is irregular in its layout. While the seminar will cover several fundamentals about GNNs, it will also include complex use case application examples about using GNNs in numerical simulations and particle reconstruction.
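
The central operation of many GNNs is message passing: each node updates its features from those of its neighbours. The sketch below shows one round with mean aggregation on a made-up three-node graph. It deliberately leaves out the learnable weights and nonlinearities that real GNN layers add, and the function name is an assumption for illustration.

```python
def message_passing_step(features, edges):
    """One round of mean-aggregation message passing on an undirected graph.

    features: dict mapping node -> list of floats
    edges:    list of (u, v) pairs
    Each node's new feature is the mean over itself and its neighbours.
    """
    neighbours = {node: [node] for node in features}   # include the node itself
    for u, v in edges:
        neighbours[u].append(v)
        neighbours[v].append(u)
    updated = {}
    for node, group in neighbours.items():
        dim = len(features[node])
        updated[node] = [
            sum(features[m][d] for m in group) / len(group) for d in range(dim)
        ]
    return updated

# Hypothetical path graph a - b - c with one feature per node.
features = {"a": [1.0], "b": [2.0], "c": [6.0]}
edges = [("a", "b"), ("b", "c")]
updated = message_passing_step(features, edges)
```

Because the update only follows the edge list, it works for any number of neighbours per node, which is what makes the approach suitable for irregular data.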

Quantum Support Vector Machine Algorithms - 21.04.2022

Today, all application use cases of the CoE RAISE project leverage the power of high-performance computing (HPC) systems via Central Processing Units (CPUs) or Graphics Processing Units (GPUs). However, several use cases have recently started exploring a disruptive computing technology known as quantum computing. This seminar will introduce one particular quantum computing approach called quantum annealing and will describe why it may become one form of computing in the future. Researchers use quantum annealing in CoE RAISE to solve complex optimization problems inherent in machine and deep learning algorithms. The seminar will provide particular examples in the context of a traditional but still relevant machine learning model called support vector machines (SVMs). Besides showing classification examples with SVMs, the seminar will also cover support vector regression techniques.
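
Optimization problems are commonly handed to a quantum annealer as a quadratic unconstrained binary optimization (QUBO) problem, in which an energy over binary variables is minimised. The sketch below shows the shape of such a problem on a made-up three-variable example and solves it by exhaustive search on a classical computer, which stands in for the annealer here. It is not an SVM formulation and not taken from the seminar.

```python
import itertools

def qubo_energy(bits, Q):
    """Energy: sum of Q[(i, j)] * x_i * x_j over all listed pairs, with binary x."""
    return sum(coeff * bits[i] * bits[j] for (i, j), coeff in Q.items())

def brute_force_minimum(Q, num_vars):
    """Try all 2**num_vars assignments; only feasible for tiny problems."""
    best_bits, best_energy = None, float("inf")
    for bits in itertools.product((0, 1), repeat=num_vars):
        energy = qubo_energy(bits, Q)
        if energy < best_energy:
            best_bits, best_energy = bits, energy
    return best_bits, best_energy

# Hypothetical QUBO: selecting a variable is rewarded (-1), but selecting
# x0 together with x1, or x1 together with x2, is penalised (+2).
Q = {(0, 0): -1.0, (1, 1): -1.0, (2, 2): -1.0, (0, 1): 2.0, (1, 2): 2.0}
best_bits, best_energy = brute_force_minimum(Q, 3)
```

Exhaustive search doubles in cost with every added variable, which is the motivation for exploring hardware that searches for low-energy states directly.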

Using OpenML for sharing datasets, algorithms, and experiments - 31.05.2022

The CoE RAISE project develops a unique AI framework leveraging high-performance computing (HPC) environments to enable faster machine and deep learning model training. That benefits the CoE RAISE application use cases and many other scientific and engineering domains that adopt AI techniques. This seminar will introduce the OpenML platform as one option of CoE RAISE to link to the larger AI community and offer components from the unique AI framework to many researchers worldwide using OpenML. The seminar describes the approach of OpenML as an open platform for sharing machine learning datasets, algorithms, and experiments. In addition, the seminar discusses potential collaboration opportunities through interoperability between OpenML and CoE RAISE. Hence, the OpenML platform acts as one outreach channel to the international AI community, potentially leveraging open standards such as the Open Neural Network Exchange (ONNX).

Accelerating Machine Learning with CUDA - 09.06.2022

The training webinar covers the theoretical and practical principles of massively parallel GPU computing with CUDA technology in the context of machine learning. In addition to an overview of the CUDA architecture and programming model, the seminar will discuss advanced aspects of machine learning acceleration from a GPU hardware perspective.

High Performance Data Analytics with the Helmholtz Analytics Toolkit (HeAT) - 28.06.2022

The CoE RAISE project develops many AI methods in nine compute-intensive and data-intensive use cases. The use case researchers leverage various AI tools on heterogeneous high-performance computing (HPC) systems and co-design the RAISE unique AI framework towards Exascale. The seminar demonstrates how CoE RAISE and other computational-intensive and data-intensive communities can benefit from the free Helmholtz Analytics Toolkit (HeAT). The goal of HeAT is to fill the gap between data analytics and machine learning libraries with a strong focus on single-node performance on the one hand and traditional HPC on the other. HeAT's generic Python-first programming interface integrates seamlessly with the existing data science ecosystem in CoE RAISE. It makes writing scalable scientific and data science applications as effortless as using NumPy. The seminar provides a thorough introduction to HeAT and its use cases and discusses a possible adoption of HeAT in the RAISE unique AI framework design.

Towards a CoE RAISE Unique AI Software Framework for Exascale - 29.08.2022

Different frameworks are available to distribute the training of AI models across different GPUs. This seminar presents step-by-step guidance on how to use and deploy three well-established deep learning frameworks on different HPC systems and also explores an application of a use case in CoE RAISE on a turbulence dataset.

Transformer Models - 26.09.2022

The CoE RAISE project develops several AI methods in nine compute-intensive and data-intensive use cases. The use case researchers leverage various AI methods using heterogeneous high-performance computing (HPC) systems and co-design the RAISE unique AI framework towards Exascale. After a short introduction to CoE RAISE, the seminar demonstrates how CoE RAISE and other computational-intensive and data-intensive communities can benefit from cutting-edge AI approaches such as transformer models. Furthermore, it will introduce the benefits of representation learning for unsupervised learning in particular and attention mechanisms in transformer models in general. Finally, application examples and use cases are part of the seminar in the specific context of the different models.
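
The attention mechanism mentioned above can be sketched in a few lines. The example implements scaled dot-product attention in pure Python, in which each query is compared with all keys and the resulting weights mix the values. The token vectors are made up, and real transformer layers add learned projections and multiple heads that this sketch omits.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention for lists of equal-length vectors."""
    scale = math.sqrt(len(keys[0]))
    outputs, all_weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(dimension).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        # Output is the weighted mix of the value vectors.
        output = [
            sum(w * v[d] for w, v in zip(weights, values))
            for d in range(len(values[0]))
        ]
        outputs.append(output)
        all_weights.append(weights)
    return outputs, all_weights

# Hypothetical 2-D token embeddings; self-attention uses them as Q, K and V.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = attention(tokens, tokens, tokens)
```

Each row of `weights` sums to one, so every output vector is a weighted average of the inputs in which tokens similar to the query receive the largest share.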

CoE RAISE - drive. enable. innovate.

Last but not least, we would like to share our self-produced promotional video about the project.

More videos to follow.