Towards more efficient and accurate algorithms in science and industry using machine learning.

 

 

 

One of the primary goals in CoE RAISE is the development and expansion of Artificial Intelligence (AI) methods in line with representative use cases from research and industry. These have a strong focus on data-driven technologies, i.e. analyzing data-rich descriptions of physical phenomena. Example use cases vary widely and range from fundamental physics and remote sensing to 3D printing and acoustics.

In our September news item last year, we introduced the data-driven use cases of work package 4 (WP4). In this article, we take a closer look at the progress made so far and at what comes next.

Use case 1: Collision event reconstruction at CERN’s Large Hadron Collider

The high-energy physics (HEP) research community is set to see an enormous increase in data production in the coming decades. One of the many different approaches being investigated to tackle this is the replacement of traditional HEP algorithms with faster, parallelizable AI-driven approaches. These approaches promise to deliver comparable physics performance and can relatively easily be accelerated by hardware such as graphics processing units (GPUs) or field-programmable gate arrays (FPGAs). The aim is to increase the processing speed — thereby making it possible to process and analyze more data — while also maintaining the accuracy of the algorithms. The use of such hardware accelerators also often improves energy efficiency.

Particle-flow reconstruction algorithms provide an example of traditional algorithms that could potentially be replaced by AI-based approaches. These algorithms process signals from different sub-detectors, such as tracker systems and calorimeters, combining the information in those signals to construct higher-level physics objects.

The work on a machine-learned particle-flow (MLPF) reconstruction algorithm, which is carried out in collaboration with the CMS experiment at CERN, has come a long way since the start of the project. We are not only contributing to the continuous development of the MLPF algorithm, but we also capitalise on high-performance computing (HPC) resources for efficient training and hyperparameter optimization.

A new data-loading pipeline, which we have called heptfds, was developed to provide faster and easier data loading during training. heptfds gets its name from HEP and the TensorFlow Datasets package (often imported as tfds), upon which it is built.

The work on hyperparameter optimization of MLPF was presented at the 20th International Workshop on Advanced Computing and Analysis Techniques in Physics Research (ACAT) in November 2021 [1]. In addition, a more detailed description of the MLPF model and its physics performance was presented at the same conference [2]. The results in [1] show the impact HPC resources can have on our ability to perform large-scale hyperparameter optimization. The performance increase achieved through hyperparameter tuning would not have been attainable without access to RAISE’s HPC resources: the hyperparameter search would have taken roughly half a year on a single modern GPU, compared to approximately 80 hours using HPC resources. The model’s performance increase is illustrated in Figure 1, which shows the training and validation loss curves before and after hyperparameter optimization.
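As a rough back-of-the-envelope check, the speed-up implied by these two figures can be computed directly. The sketch assumes that "roughly half a year" means 365 / 2 days of continuous running on one GPU; that conversion is our assumption and not a number taken from [1].

```python
HOURS_PER_DAY = 24

# Assumption: "roughly half a year" is taken as 365 / 2 days of continuous running.
single_gpu_hours = (365 / 2) * HOURS_PER_DAY   # 4380 hours
hpc_hours = 80                                 # approximate wall time quoted in the text

speedup = single_gpu_hours / hpc_hours
print(f"single GPU: {single_gpu_hours:.0f} h, HPC: {hpc_hours} h, speed-up ~{speedup:.0f}x")
# prints: single GPU: 4380 h, HPC: 80 h, speed-up ~55x
```

Under this assumption the search ran roughly 55 times faster in wall time, although the exact factor depends on how "half a year" and "approximately 80 hours" are rounded.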

fig1.png

Figure 1: Comparison of the model training performance before (left) and after (right) hypertuning. Training and validation losses as a function of the training epoch. The solid lines are the average of 10 trainings with identical hyperparameter configurations. The shaded regions show the mean ± one standard deviation.
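For readers who want to build a similar plot, the following minimal sketch shows one way to compute a mean curve and a one-standard-deviation band from repeated trainings. The function name `loss_band`, the toy numbers and the use of the sample standard deviation are all assumptions for illustration; they are not taken from the MLPF code or from Figure 1.

```python
from statistics import mean, stdev

def loss_band(runs):
    """Return per-epoch (mean, lower, upper) over repeated trainings.

    `runs` is a list of loss curves, one list of per-epoch losses per training.
    """
    band = []
    for epoch_losses in zip(*runs):      # group the runs epoch by epoch
        m = mean(epoch_losses)
        s = stdev(epoch_losses)          # sample standard deviation (assumption)
        band.append((m, m - s, m + s))
    return band

# Toy data: three hypothetical trainings over four epochs (not real MLPF losses).
runs = [
    [1.00, 0.60, 0.40, 0.30],
    [1.10, 0.70, 0.45, 0.35],
    [0.90, 0.50, 0.35, 0.25],
]
for epoch, (m, lo, hi) in enumerate(loss_band(runs), start=1):
    print(f"epoch {epoch}: mean={m:.3f}, band=[{lo:.3f}, {hi:.3f}]")
```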

The next steps in the work on MLPF include generating new, much larger datasets with improved ground-truth definitions. We hope training on this new data will unlock further potential in the MLPF algorithm.

In parallel, we have also spent recent months working on a benchmarking effort for large HPC workloads. One of the main goals is early identification of demanding I/O requirements of big data applications. We use this information to better match a workload’s I/O needs to a site's given storage and I/O services, as well as to better inform the choice of I/O strategies used in application development, yielding higher I/O performance and reducing potential I/O bottlenecks on shared services. The MLPF codebase has been containerized for mobility and reproducibility, and will be an early example of how to characterize and benchmark HPC I/O services for large data workloads. Initial work has been presented to the benchmarking working group of the Worldwide LHC Computing Grid (WLCG), where it was well received.
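To give a flavour of what characterizing I/O can mean at the smallest scale, the sketch below times a sequential read of a scratch file and reports the throughput. It is a toy under stated assumptions, not the benchmarking tooling described above: the helper name `measure_read_throughput`, the 8 MiB file size and the 1 MiB block size are all hypothetical choices.

```python
import os
import tempfile
import time

def measure_read_throughput(path, block_size=1 << 20):
    """Read `path` sequentially and return (bytes_read, seconds, MB/s)."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(block_size)
            if not chunk:
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    mb_per_s = (total / 1e6) / elapsed if elapsed > 0 else float("inf")
    return total, elapsed, mb_per_s

# Create a small scratch file as a stand-in for a training data shard.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(os.urandom(8 * 1024 * 1024))   # 8 MiB of random bytes
    scratch_path = tmp.name

try:
    n_bytes, seconds, rate = measure_read_throughput(scratch_path)
    print(f"read {n_bytes} bytes in {seconds:.4f} s ({rate:.1f} MB/s)")
finally:
    os.remove(scratch_path)
```

Because the file has just been written, this read is most likely served from the operating system's page cache, so the number says little about the underlying storage. A realistic benchmark would need to control for caching and to run against the shared file systems of the site in question.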

Use case 2: Seismic imaging with remote sensing for energy applications

Seismic imaging

Seismic imaging is currently the best available technology for imaging the Earth’s subsurface structures with a high level of confidence and is an indispensable tool for the discovery of oil and gas reservoirs. With environmental issues coming to the fore, seismic imaging is also being used to explore the Earth’s subsurface for clean energy sources such as geothermal energy.

This technology, however, faces considerable computational challenges, which stem from having to solve a large-scale inverse problem. This involves processing vast amounts of 3D seismic data as well as accounting for multiple wave reflections occurring within the Earth’s subsurface. Datasets can comprise tens of thousands of source locations, each emitting seismic waves that are reflected in the Earth and then recorded by thousands of receivers spread over the surveyed area. With a basic spatial sampling interval of 12.5 m to 50 m in both spatial directions, and survey areas spanning hundreds of square kilometres, the datasets often reach hundreds of terabytes or even the petabyte scale.

With the advent of AI technologies in recent years, RAISE aspires to tackle these data-driven challenges by optimizing seismic imaging technologies using deep learning (DL), simulation and data assimilation approaches. To achieve this, we are working towards replacing the computationally expensive components of seismic imaging, such as forward modeling, with AI-based ones. To this end, state-of-the-art AI-based frameworks, such as physics-informed neural networks (PINNs) [3] and the Fourier neural operator (FNO) [4], are currently being developed for the AI-driven modeling of seismic downgoing (into the subsurface) and upgoing (back to receivers on the Earth’s surface) waves, with the goal of producing accurate and efficient AI models for the propagation of seismic waves. For the training, a subset of the required wavefields is used (e.g., a subset of wave frequencies or a subset of source locations). In addition, physical relationships, e.g. those related to one-way wave propagation between depth levels, will be incorporated as physical constraints, so that the resulting AI model produces physically relevant results.
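To make the idea of a physical constraint more concrete, here is a minimal sketch of one-way extrapolation for a single plane-wave component in a homogeneous layer, together with a penalty that measures how far a predicted wavefield deviates from it. This is a textbook phase-shift relation used as an illustration, not the RAISE implementation. The function names, the sign convention exp(-i kz dz) and all numerical values are assumptions.

```python
import cmath
import math

def vertical_wavenumber(omega, c, kx):
    """Vertical wavenumber kz for angular frequency omega, velocity c and horizontal wavenumber kx."""
    kz_squared = (omega / c) ** 2 - kx ** 2
    if kz_squared >= 0:
        return complex(math.sqrt(kz_squared), 0.0)   # propagating wave
    return complex(0.0, -math.sqrt(-kz_squared))     # evanescent wave: decays with depth

def extrapolate(p, omega, c, kx, dz):
    """Carry one plane-wave component p across a homogeneous layer of thickness dz."""
    return p * cmath.exp(-1j * vertical_wavenumber(omega, c, kx) * dz)

def physics_penalty(p_upper, p_lower_predicted, omega, c, kx, dz):
    """Squared mismatch between a predicted field and the extrapolated one."""
    expected = extrapolate(p_upper, omega, c, kx, dz)
    return abs(p_lower_predicted - expected) ** 2

# Illustrative values (assumptions): 20 Hz, 2000 m/s, 10 m depth step.
omega = 2 * math.pi * 20.0
c, kx, dz = 2000.0, 0.01, 10.0
p_top = 1.0 + 0.0j
p_exact = extrapolate(p_top, omega, c, kx, dz)

print(abs(p_exact))                                             # amplitude of the propagating wave
print(physics_penalty(p_top, p_exact, omega, c, kx, dz))         # consistent prediction
print(physics_penalty(p_top, 0.5 * p_exact, omega, c, kx, dz))   # inconsistent prediction
```

In a physics-informed training setup, a term of this kind would typically be added to the data misfit with a weighting factor, so that predictions violating the propagation relation are penalized even where no training data exist.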

Once these methodologies are implemented, we plan to first test them on synthetic models and data as a means to assess their accuracy and efficiency. In a second step, they will be deployed on real field data. After completing our AI-based algorithms for seismic wave propagation, we plan to also explore the use of AI models for the seismic inversion stage where the reflectivity structures and velocity distribution need to be extracted. In accordance with the Joint Migration Inversion approach [5], we will iteratively use forward modeling and compare the modeled responses with the measured ones in order to yield insight into how the unknown parameters (velocity and reflectivity) need to be updated.
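The cycle of forward modeling, comparison with measurements and parameter update can be illustrated with a deliberately tiny example. The sketch below is not Joint Migration Inversion: it estimates a single constant velocity from straight-ray travel times by gradient descent. The forward model, the step size and all numbers are assumptions made for illustration.

```python
def forward(slowness, distances):
    """Toy forward model: travel time = distance * slowness (straight rays, constant medium)."""
    return [d * slowness for d in distances]

def invert_slowness(observed, distances, initial_slowness, n_iter=50):
    """Iteratively update the slowness so that modeled travel times match the observed ones."""
    s = initial_slowness
    step = 0.25 / sum(d * d for d in distances)   # halves the error in each iteration
    for _ in range(n_iter):
        modeled = forward(s, distances)                                   # 1. forward modeling
        residuals = [m - o for m, o in zip(modeled, observed)]            # 2. compare with data
        gradient = 2.0 * sum(d * r for d, r in zip(distances, residuals))
        s -= step * gradient                                              # 3. update the parameter
    return s

distances = [1000.0, 2000.0, 3000.0]   # metres
true_velocity = 2500.0                 # m/s, used only to generate synthetic data
observed = forward(1.0 / true_velocity, distances)

estimated = invert_slowness(observed, distances, initial_slowness=1.0 / 1500.0)
print(f"estimated velocity: {1.0 / estimated:.1f} m/s")
```

The real problem has many more unknowns (velocity and reflectivity at every subsurface location) and a far more expensive forward model, which is why replacing that forward model with a fast AI surrogate is attractive.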

Since the aim of the RAISE task is to deploy both remote sensing and seismic imaging in an integrative approach, we need access to datasets from a single area that contain both seismic and satellite information. Our efforts to find openly available datasets have come to fruition with the identification of a dataset on geothermal reservoirs from the Netherlands [6], which will be explored by both subtasks once the tools have been tested.

Remote sensing

The generation of consistent and frequently updated land cover maps is of crucial importance for monitoring the surface of the Earth. We are working towards a reliable and scalable framework for continuously updating such maps. Some challenges still lie ahead, and we are starting by addressing the fundamental ones.

As our framework is designed to work on a large amount of data, it is an application where DL can be beneficial. The choice of hyperparameters is of key importance for DL models to deliver the best performance. However, finding the best set of hyperparameters is often computationally expensive, as many different models have to be fully trained. To reduce this cost, we experimented with increasing the batch size during the training process, which resulted in shorter training times. We submitted a paper showing the achieved speed-ups to the IGARSS 2022 conference [7].
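A minimal sketch of what a growing batch size can look like is given here. The doubling schedule, the cap and the dataset size are hypothetical; the schedule actually used in [7] is not described in this article.

```python
import math

def batch_size_schedule(initial_batch_size, n_epochs, double_every, max_batch_size):
    """Batch size per epoch: doubled every `double_every` epochs, capped at `max_batch_size`."""
    return [
        min(initial_batch_size * 2 ** (epoch // double_every), max_batch_size)
        for epoch in range(n_epochs)
    ]

def total_steps(n_samples, batch_sizes):
    """Number of optimizer steps over all epochs (a final partial batch counts as one step)."""
    return sum(math.ceil(n_samples / b) for b in batch_sizes)

# Hypothetical settings, chosen only for illustration.
n_samples = 100_000
schedule = batch_size_schedule(initial_batch_size=64, n_epochs=8, double_every=2, max_batch_size=512)
fixed = [64] * 8

print(schedule)                          # [64, 64, 128, 128, 256, 256, 512, 512]
print(total_steps(n_samples, fixed))     # 12504 steps with a constant batch size
print(total_steps(n_samples, schedule))  # 5864 steps with the growing schedule
```

Fewer optimizer steps do not automatically translate into shorter wall time, but on accelerators that process larger batches efficiently they are one plausible source of the observed reduction in training time.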

fig2.png

Figure 2: Portion of the considered study area (Trentino, Italy): (a) number of land cover changes per pixel obtained with the proposed multi-year method, (b) number of land cover changes per pixel obtained with the single-year baseline method, (c) the true color composition of a Sentinel-2 image acquired in 2018, and (d) corresponding land cover map obtained [6].

We are also working towards the validation of our approach, to demonstrate that it can generate land cover maps that are robust against noise while still detecting changes in land cover. We have been testing various classification methods, including Random Forests and Support Vector Machines as well as DL methods such as long short-term memory (LSTM) networks. The generated outputs are maps with predicted land cover classes, as shown in Figure 2. A paper with the preliminary results was also submitted to IGARSS 2022 [8].
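Panels (a) and (b) of Figure 2 show the number of land cover changes per pixel. Assuming one predicted label map per year, such a count can be sketched as follows. The function names and the toy class labels are hypothetical and are not taken from the project code.

```python
def count_changes(label_series):
    """Number of times the predicted class changes between consecutive years for one pixel."""
    return sum(1 for prev, curr in zip(label_series, label_series[1:]) if prev != curr)

def change_map(yearly_maps):
    """Per-pixel change counts from a list of yearly label maps (2D lists of equal shape)."""
    n_rows, n_cols = len(yearly_maps[0]), len(yearly_maps[0][0])
    return [
        [count_changes([year[r][c] for year in yearly_maps]) for c in range(n_cols)]
        for r in range(n_rows)
    ]

# Toy 2x2 maps for three years with hypothetical class names.
maps = [
    [["forest", "crop"], ["urban", "water"]],
    [["forest", "grass"], ["urban", "water"]],
    [["forest", "crop"], ["urban", "crop"]],
]
print(change_map(maps))   # [[0, 2], [0, 1]]
```

A pixel that flips class and then flips back, like the top-right one here, is a typical signature of classification noise, whereas a single lasting change is more likely to be real. This is the kind of distinction a robust time-series method has to make.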

Use case 3: Defect-free Metal Additive Manufacturing

To demonstrate the potential of machine learning for manufacturing applications, we characterize and develop an anomaly detector for Selective Laser Melting (SLM), an industrial 3D printing process for metal. The focus is on a specific type of anomaly, keyhole porosity [9, 10], which directly impacts the integrity and therefore the longevity of a printed object. After printing, such porosities can only be detected non-destructively in very thin objects, using expensive and time-consuming X-ray CT scanning. To alleviate the need for this post-processing step, we propose to train and compare different video anomaly detection neural networks, such as [11], [12] or [13], on this specific task.

fig3.png

Figure 3: Distributions for laser speed and laser power, tuned to achieve nominal mean power density while respecting the physical process and equipment constraints.

The obvious first step towards this goal is the creation and curation of a dataset. We performed a parameter space exploration of both laser power and laser speed. To optimize the data yield of the experiment while maintaining sufficient X-ray penetration, we print cylinders with a diameter of 8 mm. The parameters of almost all laser paths in the bulk of these cylinders were randomized (sampled i.i.d. from truncated normal distributions). This achieves two objectives. First, it provides enough combinations and variability of speed and power to characterize the process well, allowing us to train robust neural networks that generalize. Second, it creates anomalies in the part at a greatly increased rate compared to nominal operation.
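The randomization of laser paths can be sketched as follows, using simple rejection sampling to draw from a truncated normal distribution. All means, standard deviations and bounds in this example are hypothetical placeholders and are not the values used in the experiment; the function names are likewise made up for illustration.

```python
import random

def sample_truncated_normal(rng, mean, std, low, high):
    """Draw one value from a normal(mean, std) restricted to [low, high] by rejection."""
    while True:
        value = rng.gauss(mean, std)
        if low <= value <= high:
            return value

def sample_laser_paths(n_paths, seed=0):
    """Independently sample a (power, speed) pair for each laser path."""
    rng = random.Random(seed)
    paths = []
    for _ in range(n_paths):
        # Hypothetical process windows: NOT the values used in the experiment.
        power_w = sample_truncated_normal(rng, mean=200.0, std=40.0, low=100.0, high=300.0)
        speed_mm_s = sample_truncated_normal(rng, mean=800.0, std=200.0, low=300.0, high=1300.0)
        paths.append((power_w, speed_mm_s))
    return paths

paths = sample_laser_paths(1000)
print(paths[:3])
```

Rejection sampling is efficient here because the bounds sit a few standard deviations around the mean, so most draws are accepted. The bounds play the role of the physical process and equipment constraints mentioned in the caption of Figure 3.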

fig4.png

Figure 4: Example of 2 cm cylinders. Virtual CAD model on the left and printed object on the right.

Use case 4: Sound Engineering

The overall aim of our sound engineering work is to use DL approaches to produce high-precision spatial audio algorithms that convey accurate location information to humans via generated sound. A key component of this work is identifying a way to evaluate the accuracy of candidate algorithms that does not require an in-person listening test.

To this end, we have developed a class of machine learning-based models that replicate the sound localization capabilities of individual listeners. The class consists of a common neural network architecture and a training regime that tailors it to match a particular individual. Figure 5 summarizes the localization accuracy of these models for various individuals. A paper describing the development of this model has been submitted to the Sound and Music Computing Conference [14], and one describing the speed-ups we were able to obtain through the use of high-performance computing was presented at the IT2022 conference in February 2022 [15].

fig5.png

Figure 5: Accuracy and training cost for sound localization models of various individuals. Blue points were used to determine the parameters used for all models.

We are also in the process of collecting a novel acoustic dataset to aid in the development of these spatial audio algorithms. The dataset relates anthropometric information, such as 3D scans, to the corresponding acoustic effects. Our methods for creating this dataset will be presented at the upcoming meeting of the Acoustical Society of America [16].

While the collection of our own dataset is ongoing, we will utilize existing third-party datasets to prototype the DL techniques we’ll need going forward.

References

[1] E. Wulff, M. Girone, J. Pata, (2021) Hyperparameter Optimization of Data-Driven AI models on HPC Systems, In Proceedings of the ACAT 2021 Conference, J. Phys.: Conf. Series

doi: https://doi.org/10.48550/arXiv.2203.01112

[2] J. Pata, J. Duarte, F. Mokhtar, E. Wulff, J. Yoo, J.R. Vlimant, M. Pierini, M. Girone, (2021) Machine Learning for Particle Flow Reconstruction at CMS, In Proceedings of the ACAT 2021 Conference, J. Phys.: Conf. Series

doi: https://doi.org/10.48550/arXiv.2203.00330

[3] Raissi M., Perdikaris P., Karniadakis G. (2019) Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations J. Comput. Phys., 378, pp. 686-707.

[4] Li, Z., Kovachki, N., Azizzadenesheli, K., Liu, B., Bhattacharya, K., Stuart, A. and Anandkumar, A., 2020. Fourier neural operator for parametric partial differential equations. arXiv preprint arXiv:2010.08895.

[5] Staal, X.R., 2015. Combined imaging and velocity estimation by joint migration inversion. Technische Universiteit Delft (cit. on p. 43).

[6] https://scanaardwarmte.nl/ 

[7] “ACCELERATING HYPERPARAMETER TUNING OF A DEEP LEARNING MODEL FOR REMOTE SENSING IMAGE CLASSIFICATION”, M. Aach, R. Sedona, A. Lintermann, G. Cavallaro, H. Neukirchen, M. Riedel, IGARSS 2022 (under review)

[8] “AN AUTOMATIC APPROACH FOR THE PRODUCTION OF A TIME SERIES OF CONSISTENT LAND-COVER MAPS BASED ON LONG-SHORT TERM MEMORY”, R. Sedona, C. Paris, L. Tian, M. Riedel, G. Cavallaro, IGARSS 2022 (under review)

[9] Thanki, A., Goossens, L., Mertens, R., Probst, G., Dewulf, W., Witvrouw, A., & Yang, S. (2019). Study of keyhole-porosities in selective laser melting using X-ray computed tomography. Proceedings of iCT 2019, 1-7. 

[10] Heylen, R., Thanki, A., Verhees, D., Iuso, D., De Beenhouwer, J., Sijbers, J., ... & Bey-Temsamani, A. (2022). 3D total variation denoising in X-CT imaging applied to pore extraction in additively manufactured parts. Measurement Science and Technology, 33(4), 045602.

[11] Sultani, W., Chen, C., & Shah, M. (2018). Real-world anomaly detection in surveillance videos. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 6479-6488). 

[12] Feng, J. C., Hong, F. T., & Zheng, W. S. (2021). Mist: Multiple instance self-training framework for video anomaly detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 14009-14018). 

[13] Tian, Y., Pang, G., Chen, Y., Singh, R., Verjans, J. W., & Carneiro, G. (2021). Weakly-supervised video anomaly detection with robust temporal feature magnitude learning. In Proceedings of the IEEE/CVF International Conference on Computer Vision (pp. 4975-4986).

[14] E. M. Sumner, R. Unnthorsson, and M. Riedel, “Replicating human sound localization with a multi-layer perceptron.” Submitted to 19th Sound and Music Computing Conference, Saint-Étienne, France, 2022.

[15] E. M. Sumner, M. Aach, A. Lintermann, R. Unnthorsson, and M. Riedel. “Speed-Up of Machine Learning for Sound Localization via High-Performance Computing.” 26th International Conference on Information Technology, Žabljak, February 2022.

[16] E. M. Sumner, M. Riedel, and R. Unnthorsson.  “Design and manufacture of synthetic pinnæ for studying head-related transfer functions.”  182nd Meeting of the Acoustical Society of America, Denver, CO, USA, May 2022.