Data-Driven Use-Cases towards Exascale

The different use-cases in Work Package 4 (Data-Driven Use-Cases towards Exascale) have been working hard during the summer and have some exciting new progress to share with the CoE RAISE community.

Use case 1: Collision event reconstruction at CERN’s Large Hadron Collider

During the summer, we have been busy exploring the use of a special kind of quantum computing called quantum annealing (QA), which can solve a specific group of mathematical problems called QUBO problems. QUBO is the abbreviation for Quadratic Unconstrained Binary Optimization and some examples of tasks that can be formulated as QUBO problems include the traveling salesman problem, the job shop scheduling problem, and the training of Support Vector Machines (SVM) or Support Vector Regression (SVR) models.
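To make the QUBO form concrete, here is a minimal sketch that is not taken from the project code. It encodes a toy task ("pick exactly one of three options at the lowest cost") as a QUBO and solves it by brute force on a classical computer. The costs, the penalty weight, and all function names are illustrative assumptions; a quantum annealer would search for the same minimum-energy bit string rather than enumerating every candidate.

```python
from itertools import product


def qubo_energy(Q, x):
    """Energy of a binary vector x for QUBO coefficients Q[(i, j)]."""
    return sum(coeff * x[i] * x[j] for (i, j), coeff in Q.items())


def solve_qubo_brute_force(Q, n):
    """Try all 2**n bit strings; only feasible for very small n."""
    return min(product((0, 1), repeat=n), key=lambda x: qubo_energy(Q, x))


# Hypothetical toy problem: choose exactly one option with minimal cost.
costs = [3, 1, 2]
penalty = 10  # assumed weight; must be large enough to enforce the constraint

# Expanding penalty * (x0 + x1 + x2 - 1)**2 with x_i**2 == x_i gives
# -penalty on the diagonal and +2 * penalty on each pair (constant dropped).
Q = {}
for i, cost in enumerate(costs):
    Q[(i, i)] = cost - penalty
    for j in range(i + 1, len(costs)):
        Q[(i, j)] = 2 * penalty

best = solve_qubo_brute_force(Q, len(costs))
best_energy = qubo_energy(Q, best)
print(best, best_energy)
```

The constraint is not enforced separately: it is folded into the quadratic coefficients as a penalty term, which is what makes the problem "unconstrained".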

Within CoE RAISE, the partners European Organization for Nuclear Research (CERN), University of Iceland (UOI), and Forschungszentrum Jülich (FZJ) have successfully applied for computing time on the D-Wave Advantage™ System JUPSI at the Jülich Supercomputing Centre. JUPSI is the first D-Wave system in Europe and has more than 5,000 superconducting qubits! Now, 5,000 qubits may sound like a lot, but we are still quite limited in the kind of problems we are able to run on a system like this.

Inspired by previous work on Quantum-SVR (QSVR), we use the JUPSI system to train QSVR models with the aim of speeding up hyperparameter optimization of Deep Learning-based (DL) AI-models, similar to what has previously been done by others using classical SVR models. The work on QSVR and the use of JUPSI is done in collaboration with experts on AI and quantum computing from Work Package 2 of CoE RAISE (AI- and HPC-Cross Methods at Exascale).

We train the AI-based MLPF model (which we described briefly in a previous news article 2021-09) on the publicly available Delphes dataset to create a dataset for the QSVR training. The QSVR is then trained to predict the final model performance of MLPF, using only a fraction of the MLPF learning curve as well as the hyperparameter configuration as input. Looking at Figure 1, which shows a sketch of two made-up learning curves, the QSVR would take the beginning of the curves, before the point marked as decision point, as input and try to predict the values of the curves at the point marked as target. Figure 2 shows a few examples of actual learning curves from the MLPF training with each curve corresponding to a model with different hyperparameters.

Figure 1: A sketch of two hypothetical learning curves.

Figure 2: Some examples of MLPF learning curves from epoch 25 to 100.
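The input/target split described above can be sketched as follows. This is only an illustration of how one training example for a learning-curve regressor might be assembled; the function name, the hyperparameter values, and the loss values are made up, and the actual QSVR pipeline may encode its inputs differently.

```python
def make_example(hyperparams, curve, decision_point, target_point):
    """Build one (features, target) pair from a single learning curve.

    curve[i] is assumed to hold the validation loss after epoch i + 1.
    Only epochs up to `decision_point` are visible to the regressor;
    the value at `target_point` is what it should learn to predict.
    """
    if not 0 < decision_point < target_point <= len(curve):
        raise ValueError("need 0 < decision_point < target_point <= len(curve)")
    features = list(hyperparams) + list(curve[:decision_point])
    target = curve[target_point - 1]
    return features, target


# Hypothetical run: (learning rate, batch size) and a made-up loss curve.
hyperparams = [0.001, 64]
curve = [1.0, 0.8, 0.7, 0.65, 0.6, 0.58]

features, target = make_example(hyperparams, curve, decision_point=3, target_point=6)
print(features, target)
```

Repeating this for many hyperparameter configurations yields the dataset on which a regressor, classical or quantum, could be trained.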

This is still a work in progress, so please be on the lookout for some fresh results in the coming months. We are excited to share them with you soon.

Use case 2: Seismic imaging with remote sensing for energy applications

Remote Sensing

We have generated land cover classification maps for the years 2018, 2019, and 2020 for the whole of the Netherlands with an established Machine Learning (ML) method (Random Forest), and we are now starting an assessment of the added value provided by advanced DL models, with the aim of offering a seasonal update. We are now shifting our attention from the initial phase of prototyping to the validation of the generated products, specifically to estimate the accuracy of the detection of real changes that take place on the ground and the robustness against noisy observations.

Seismic Imaging

In recent years, a state-of-the-art seismic data imaging methodology called Joint Migration Inversion (JMI) has been developed. Under the RAISE project, we aim to replace its computationally expensive wave modeling part with an efficient ML-based propagator. This approach will help the community make the modeling and inversion processes much faster and more efficient, promoting its uptake in various projects, including the green energy-transition field, where budgets may be limited. A newly developed ML-based code for wave propagation has been successfully tested on some synthetic seismic datasets. The network using a Fourier Neural Operator (FNO) outperformed regular neural networks for a single frequency component of the tested seismic data. Its accuracy is illustrated in Figure 3. With such neural networks, we hope to speed up the seismic wave modeling process.
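For readers unfamiliar with Fourier Neural Operators, the following is a simplified sketch of the spectral convolution at the heart of an FNO layer. It is not the project's propagator. It assumes a single-channel, real-valued 1D signal and uses a naive discrete Fourier transform from the standard library to stay self-contained. A full FNO layer would add channel mixing, a pointwise linear path, and a nonlinearity, and would use an FFT library. The function names and the weights are illustrative assumptions.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform (O(n**2), for illustration only)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Inverse of dft()."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def spectral_conv_1d(signal, weights):
    """Scale the lowest len(weights) Fourier modes and discard the rest.

    In a trained FNO the weights are learned complex numbers; here the
    caller passes them in directly.
    """
    n, modes = len(signal), len(weights)
    if 2 * modes > n:
        raise ValueError("need 2 * len(weights) <= len(signal)")
    spectrum = dft(signal)
    filtered = [0j] * n
    for k in range(modes):
        filtered[k] = spectrum[k] * weights[k]
        if k > 0:
            # Mirror onto the negative frequency so the output stays real.
            filtered[n - k] = spectrum[n - k] * weights[k].conjugate()
    return [value.real for value in idft(filtered)]


n = 8
low = [math.cos(2 * math.pi * t / n) for t in range(n)]
# With unit weights on the three lowest modes, a low-frequency
# input passes through unchanged (up to rounding).
kept = spectral_conv_1d(low, [1, 1, 1])
```

Because the layer acts on a fixed number of Fourier modes rather than on grid points, the same weights can in principle be applied to signals sampled at different resolutions.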

Based on this milestone, we are moving forward to handle further challenges faced by the seismic imaging community, i.e., processing of huge data volumes, seismic wave modeling and inversion for multiple frequencies, and improving compatibility with real field datasets while maintaining accuracy. We exploit the redundancy in the seismic data across frequencies and use the above-mentioned methods to do modeling for a subset of frequencies, while ML is used to reconstruct the additional frequencies to get the required broadband data. Several DL frameworks have already been tested for this problem, such as Generative Adversarial Networks (GANs) [1], U-Net [2], and pix2pix [3], and preliminary results are quite promising.

Finally, we would like to announce that we just received the seismic data for our test line from the Netherlands [4], related to geothermal investigations. This will be the input for the JMI process as well as the satellite data integration; more information can be found in a previous news article on the official RAISE website.

Figure 3: The best comparison between original data and data predicted by our newly developed Artificial Intelligence (AI)-based software for seismic wave modeling.

Use case 3: Defect-free Metal Additive Manufacturing

We have completed the assembly of a decently sized dataset. While our in-line monitoring setup is no longer a limitation on the parameter space we can cover, post-hoc CT-scanning is also needed to provide ground truth. Stainless steel is so dense, and the porosities we are looking for are so small, that even our purpose-designed, thin artifacts are quite challenging to image throughout. But we've found a provider that can resolve all defects, and they happily scanned a bunch of our artifacts.

Figure 4: Horizontal slice of a high-resolution CT-scan. Brightness corresponds to density. The lines of darker dots are porosities, holes in the material. The two large indents at the lower edge are for calibration. The brighter central disk is an artifact of the scanning procedure.

We've gladly made use of the supercomputer facilities available in RAISE to train and test several computer vision models on the data. The most straightforward task is to reconstruct the laser's speed and power from observations of the melt pool. While this may seem superfluous because the laser's parameters are planned upfront (otherwise the 3D printer couldn't work), such a model is already useful on its own, as a strong deviation of the prediction from the planned parameters indicates an anomaly in the printing process. This task is analogous to activity classification from video, for which a rich literature and many existing models are available. We've tested simple Convolutional Neural Networks (CNNs), 3D-CNNs (including SlowFast), Autoencoders, and a video Transformer model. We're not keeping the fun to ourselves; we plan to make the data available, so anyone can try to come up with even better models.
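The anomaly check described above can be sketched in a few lines. This is a hypothetical illustration rather than our monitoring code: the parameter names, the example values, and the 20% relative tolerance are all assumptions chosen for the example.

```python
def is_anomalous(planned, predicted, tolerance=0.2):
    """Flag a deviation between planned and model-predicted laser parameters.

    Returns True if any predicted value differs from its planned value
    by more than `tolerance` (relative to the planned value).
    """
    return any(
        abs(predicted[name] - planned[name]) > tolerance * abs(planned[name])
        for name in planned
    )


# Hypothetical planned settings and two hypothetical model predictions.
planned = {"speed_mm_s": 800.0, "power_w": 200.0}
close_prediction = {"speed_mm_s": 810.0, "power_w": 195.0}
far_prediction = {"speed_mm_s": 800.0, "power_w": 120.0}

print(is_anomalous(planned, close_prediction))
print(is_anomalous(planned, far_prediction))
```

In practice the tolerance would need to be calibrated against the model's own prediction error on defect-free prints, so that ordinary model noise is not reported as a process anomaly.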

All our training experiments are logged to a ClearML Server instance, which runs in an OpenStack cloud near the HPC infrastructure.

Use case 4: Sound Engineering

The primary focus of our sound engineering efforts is to investigate how the geometry of your outer ear affects how you hear the world. Better understanding of these effects will lead to more accurate spatial audio reproduction, which has applications in a variety of fields, including virtual reality, hearing aids, and safety systems.

As part of this investigation, we are producing a new dataset of ear geometry and corresponding acoustic measurements. We have now completed the construction and validation of our test apparatus and methods and are proceeding with the main data collection activities. We digitally manipulate 3D scans of volunteers to produce controlled series of ear shapes that we would like to include in our training data. We then manufacture physical replicas of these ears using a process of 3D printing and silicone casting, and measure their acoustic properties in an anechoic chamber, as can be seen in Figure 5.

Figure 5: Left: 3D Printing a mold for casting a replica ear. Right: Measuring the acoustics of a replica ear.

While the data collection is ongoing, we are proceeding with our modeling work. We expect that the final model produced for this use case will be in the form of an Autoencoder, which is composed of two submodels: an encoder that is responsible for processing ear geometry information and a decoder that is responsible for replicating the acoustic effects of the ear. We are using separate pre-existing databases of ear shapes and acoustic measurements to determine the ideal structure of these two submodels, and Transfer Learning from these preliminary models will allow us to train our final model more quickly.

References

[1] Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A. and Bengio, Y., 2020. Generative adversarial networks. Communications of the ACM, 63(11), pp.139-144.

[2] Ronneberger, O., Fischer, P. and Brox, T., 2015, October. U-net: Convolutional networks for biomedical image segmentation. In International Conference on Medical image computing and computer-assisted intervention (pp. 234-241). Springer, Cham.

[3] Isola, P., Zhu, J.Y., Zhou, T. and Efros, A.A., 2017. Image-to-image translation with conditional adversarial networks. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 1125-1134).