In more detail, sound engineering refers to Acoustic and Tactile Engineering (ACUTE), driven forward by the Simulation and Data Lab (SDL) ACUTE in Iceland in collaboration with FZJ within RAISE.
An essential element of ACUTE is individual 3D spatial auditory displays for immersive virtual environments. 3D sound technologies can provide accurate information about the relationship between a sound source and the surrounding environment, including the listener herself/himself. This information cannot be substituted by any other modality (e.g., visual or tactile). Nevertheless, today's spatial representation of audio tends to be simplistic, with poor interaction capabilities, since multimodal systems are primarily focused on graphics processing and integrate only basic audio solutions. This use case in RAISE aims to convey environmental information via acoustics using binaural (3D) sounds.
Typically, binaural audio technologies rely on head-related transfer functions (HRTFs), specific digital filters that capture the acoustic effects of the human head. Obtaining personal HRTF data is only possible with expensive equipment and invasive recording procedures. Instead, non-individual HRTFs, acoustically measured on anthropomorphic mannequins, are commonly used; Figure 1 shows this data collection. The drawback of non-individual HRTFs is that these transfer functions almost never match the listener's unique anthropometry, especially that of the outer ear, resulting in frequent localization errors such as front/back reversals, elevation angle misperception, and inside-the-head localization.
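Conceptually, rendering a sound binaurally with HRTFs reduces to filtering a mono signal with the pair of head-related impulse responses (HRIRs, the time-domain counterparts of HRTFs) for the left and right ears. The following is a minimal NumPy sketch of that idea; the function name and the dummy impulse-response data are purely illustrative, not part of the RAISE toolchain.

```python
import numpy as np

def render_binaural(mono, hrir_left, hrir_right):
    """Render a mono signal to binaural stereo by convolving it with
    a pair of head-related impulse responses (HRIRs), the time-domain
    counterparts of HRTFs. All inputs are 1-D NumPy arrays."""
    left = np.convolve(mono, hrir_left)
    right = np.convolve(mono, hrir_right)
    return np.stack([left, right], axis=0)

# Toy example: a unit impulse as "source" and dummy HRIRs encoding an
# interaural time difference of 3 samples (source slightly to the left).
mono = np.zeros(8); mono[0] = 1.0        # unit impulse
hrir_l = np.zeros(16); hrir_l[0] = 1.0   # left ear: direct arrival
hrir_r = np.zeros(16); hrir_r[3] = 0.8   # right ear: delayed, attenuated
stereo = render_binaural(mono, hrir_l, hrir_r)
print(stereo.shape)  # (2, 23): two channels, length len(mono)+len(hrir)-1
```

Real HRIRs are of course measured (or, in RAISE, synthesized) filters of a few hundred taps per direction, but the rendering step itself is this convolution.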
Figure 1: Data collection example: 3D scanning of outer ears and acoustic measurements
Figure 1 illustrates the methodology of this use case in RAISE, which involves collecting an extensive dataset of HRTFs. The 3D scanning of outer ears complements the HRTF measurements and is used to create 3D silicone ear models to place on the mannequin. Acoustic measurements of the silicone-modelled ears placed on a mannequin (carried out in an anechoic chamber at UoI) are paired with anthropometric measurements of those ears. RAISE aims to develop a technically sound methodology for HRTF analysis and synthesis that establishes a physical connection between anthropometric and acoustic data in each structural component. This includes an extensive evaluation procedure comparing the HRTFs obtained through the developed structural model against the measured HRTFs.
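Such an evaluation needs an objective distance between a model-synthesized HRTF and the measured ground truth. One common choice in the HRTF literature is log-spectral distortion (LSD); the sketch below is a generic NumPy implementation offered as an illustration, not the evaluation metric actually adopted by RAISE.

```python
import numpy as np

def log_spectral_distortion(hrtf_measured, hrtf_model, eps=1e-12):
    """Log-spectral distortion (LSD) in dB between a measured HRTF
    magnitude spectrum and one synthesized by a structural model.
    Both inputs are arrays of magnitudes over frequency bins; `eps`
    guards against log of zero."""
    diff_db = 20.0 * np.log10((np.abs(hrtf_measured) + eps) /
                              (np.abs(hrtf_model) + eps))
    return np.sqrt(np.mean(diff_db ** 2))

# Sanity check on dummy spectra: identical spectra give 0 dB distortion;
# a uniform factor-of-2 magnitude error gives about 6.02 dB.
measured = np.ones(128)
print(log_spectral_distortion(measured, measured))
print(round(log_spectral_distortion(measured, 0.5 * measured), 2))
```

In practice the metric would be averaged over source directions and restricted to the perceptually relevant frequency band before comparing model variants.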
Figure 2: Societal impact of RAISE approaches will support travel aids for the visually impaired
Figure 2 illustrates selected technological outcomes of the RAISE approaches, built on the achievements of previous EU projects (e.g., Sound of Vision, which won the ICT2018 Award). Realistic 3D auditory displays represent an innovative breakthrough for a plethora of application areas. RAISE will apply innovative AI-related methodologies to support the learning of realistic 3D sounds. Selected AI models include sequence techniques such as Long Short-Term Memory (LSTM) networks and Gated Recurrent Units (GRUs). Since measured datasets might be insufficient for training deep learning networks, cutting-edge data augmentation techniques and transfer learning methods will be used. Neural Architecture Search (NAS), based on reinforcement learning and evolutionary algorithms, will perform hyper-parameter optimization of these deep learning networks. Such AI-based approaches are computationally very demanding, and thus High-Performance Computing (HPC) resources are necessary.
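To make the GRU choice concrete, the snippet below implements a single GRU cell step in plain NumPy and folds a dummy HRTF magnitude spectrum, viewed as a sequence of frequency-bin values, into a hidden state. This is a textbook sketch of the mechanism, with random weights and made-up dimensions; it is not the project's actual model architecture.

```python
import numpy as np

def gru_step(x, h, params):
    """One forward step of a Gated Recurrent Unit (GRU) cell.
    `x` is the input vector, `h` the previous hidden state."""
    Wz, Uz, Wr, Ur, Wh, Uh = (params[k] for k in
                              ("Wz", "Uz", "Wr", "Ur", "Wh", "Uh"))
    sig = lambda a: 1.0 / (1.0 + np.exp(-a))
    z = sig(Wz @ x + Uz @ h)                   # update gate
    r = sig(Wr @ x + Ur @ h)                   # reset gate
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h))   # candidate state
    return (1 - z) * h + z * h_tilde           # gated interpolation

# Toy run over a dummy 32-bin spectrum with random weights.
rng = np.random.default_rng(0)
n_in, n_hid = 1, 4
params = {k: 0.1 * rng.standard_normal(
              (n_hid, n_hid if k.startswith("U") else n_in))
          for k in ("Wz", "Uz", "Wr", "Ur", "Wh", "Uh")}
h = np.zeros(n_hid)
spectrum = rng.standard_normal(32)
for bin_value in spectrum:
    h = gru_step(np.array([bin_value]), h, params)
print(h.shape)  # (4,) — the learned summary of the sequence
```

In a full training setup, frameworks such as PyTorch or TensorFlow would provide batched GRU/LSTM layers, and NAS would search over hyper-parameters such as the hidden size used here.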
Figure 2 shows only one example of RAISE outcomes contributing to the project's societal impact: supporting travel aids for the visually impaired. The sound engineering approaches researched in RAISE will enable further innovative breakthroughs in many other application areas, including personal cinema, teleconferencing systems, and immersive virtual reality computer games.