AES 2026 AVARIG Conference: Full Schedule

Schedule as of May 2026 - subject to change

Default Time Zone is EDT - Eastern Daylight Time

9:30am CEST

Coffee

Thursday July 2, 2026 9:30am - 10:00am CEST

IRCAM:Gallery 1, place Igor Stravinsky Paris 4e

Social event

12:30pm CEST

Lunch

Thursday July 2, 2026 12:30pm - 1:30pm CEST

IRCAM:Gallery 1, place Igor Stravinsky Paris 4e

Social event

1:30pm CEST

(P) A Compact Inverse Auditory Model for Binaural Signal Reconstruction

Thursday July 2, 2026 1:30pm - 3:00pm CEST

IRCAM:Gallery

Binaural signal synthesis is typically formulated as forward modelling using head-related transfer functions (HRTFs). We explore an inverse auditory modelling perspective in which binaural ear signals are estimated directly from a source signal and its azimuth. We present a lightweight complex-valued neural network that predicts frequency-domain binaural filters from the input source spectrum and azimuthal direction, which are then applied to synthesize binaural signals. Controlled experiments evaluate how excitation bandwidth and angular sampling density affect reconstruction and generalization. Results show accurate spectral reconstruction and interpolation to unseen source directions even when training uses sparse angular grids, while bandwidth strongly influences problem conditioning and error behaviour. This work focuses on characterizing compact signal-conditioned inverse models as efficient components for binaural signal generation.
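As a rough illustration of the final step described above (applying frequency-domain binaural filters to a source spectrum), here is a minimal sketch. The function name and the placeholder filters are hypothetical; in the paper the filters are predicted by a neural network, which is not reproduced here. Multiplying spectra this way corresponds to circular convolution of one frame.

```python
import numpy as np

def apply_binaural_filters(source, h_left, h_right):
    """Filter a mono frame with per-ear frequency responses; return (left, right)."""
    n = len(source)
    spectrum = np.fft.rfft(source)               # n // 2 + 1 complex bins
    left = np.fft.irfft(spectrum * h_left, n=n)  # back to the time domain
    right = np.fft.irfft(spectrum * h_right, n=n)
    return left, right

# Placeholder filters (assumption): unity gain on the left, half gain on the right.
rng = np.random.default_rng(0)
frame = rng.normal(size=256)
bins = frame.size // 2 + 1
h_left = np.ones(bins, dtype=complex)
h_right = 0.5 * np.ones(bins, dtype=complex)
left, right = apply_binaural_filters(frame, h_left, h_right)
```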

Speakers

Vlad Paul

Philip Nelson

IRCAM:Gallery 1, place Igor Stravinsky Paris 4e

Binaural, Poster

1:30pm CEST

(P) Short-Term VR Sound-Localization Training under Simulated Single-Sided Deafness: Evaluation of an Enhanced HRTF

Thursday July 2, 2026 1:30pm - 3:00pm CEST

IRCAM:Gallery

Single-sided deafness (SSD) reduces access to binaural cues and can make spatial-audio localization difficult in virtual reality (VR). This study investigated short-term localization training under simulated SSD in a VR task using generic, non-individualized head-related transfer function (HRTF) rendering with head-movement-contingent auditory updating, and examined whether an enhanced HRTF could improve performance by emphasizing monaurally available spectral cues at the better-hearing ear. The rationale was that, although directional judgment in normal binaural listening depends strongly on interaural differences, monaural listening must rely more heavily on direction-dependent spectral characteristics that remain available at the better-hearing ear. Twenty normal-hearing participants performed a 13-source horizontal-plane localization task using a VR headset and headphones under simulated SSD. Participants were assigned to either normal-HRTF training or enhanced-HRTF training (n = 10 each). The experiment comprised a pre-test, three training sessions, and a post-test, and all participants were tested with both normal and enhanced HRTFs, yielding four train-test combinations. Performance was evaluated using accuracy (ACC), mean absolute error (MAE), and response time (RT). Localization performance improved with training under the present VR simulated-SSD condition. ACC increased and MAE decreased from pre-test to post-test, whereas RT showed no clear change. No significant overall between-group difference in cumulative improvement was observed. However, during training, the enhanced-HRTF group showed a significant first-session advantage, and matched train-test combinations showed descriptively larger gains than mismatched combinations.
These results suggest that short-term VR localization training can improve directional judgment under simulated SSD and that enhancing monaural spectral cues may provide an early benefit by making direction-specific patterns easier to associate with source direction. The findings are limited to localization performance in the present VR task under simulated SSD and should not be directly generalized to clinical SSD populations, real-world auditory rehabilitation, or broader everyday 3D spatial-audio experience.
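For readers unfamiliar with the two error measures named above, the following sketch shows one common way to compute ACC and MAE from paired target and response angles. The trial values and the function name are hypothetical, the source layout is not the study's, and the study's exact scoring rules may differ.

```python
import numpy as np

def localization_metrics(target_deg, response_deg):
    """Return (accuracy, mean absolute error in degrees) for paired trials."""
    target = np.asarray(target_deg, dtype=float)
    response = np.asarray(response_deg, dtype=float)
    # Wrap the signed difference to [-180, 180) so 350 vs 10 counts as 20 degrees.
    diff = (response - target + 180.0) % 360.0 - 180.0
    abs_err = np.abs(diff)
    acc = float(np.mean(abs_err == 0.0))  # proportion of exactly correct choices
    mae = float(np.mean(abs_err))
    return acc, mae

# Hypothetical trials: angles are illustrative only.
targets = [-90, -45, 0, 45, 90]
responses = [-90, -30, 0, 45, 60]
acc, mae = localization_metrics(targets, responses)
print(f"ACC = {acc:.2f}, MAE = {mae:.1f} deg")
```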

Speakers

Kentaro Fujii

Ryugo Kijima

IRCAM:Gallery 1, place Igor Stravinsky Paris 4e

Binaural, Poster

1:30pm CEST

(P) An evaluation benchmark of artificial intelligence models for estimating head-related transfer functions (HRTFs) from ear shape representations

Thursday July 2, 2026 1:30pm - 3:00pm CEST

IRCAM:Gallery

Head-related transfer functions (HRTFs) are fundamental to spatial audio via binaural rendering. Personalized HRTFs have been shown to improve localization accuracy and reduce perceptual artifacts and directional ambiguities. However, acquiring such HRTFs is time-consuming and requires costly measurement setups. To address this limitation, this article investigates the use of deep learning models to estimate personalized HRTFs from ear shape representations. We propose and evaluate three different architectures with various types of input data and identify the minimum achievable spectral distance error when predicting true HRTF magnitude spectra. The best model we evaluated achieves a test Log Spectral Distortion (LSD) of 4.93 dB. We also establish a performance ranking based on input data types and architectural choices.
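The LSD figure quoted above can be read against a commonly used definition of the metric: the root-mean-square difference, in dB, between true and predicted magnitude spectra across frequency bins, averaged over filters. The sketch below assumes that definition; the abstract does not state the exact variant or averaging used by the authors, and the function name and data are illustrative.

```python
import numpy as np

def log_spectral_distortion(h_true, h_pred, eps=1e-12):
    """RMS dB magnitude difference over the last axis, averaged over the rest."""
    mag_true = np.abs(np.asarray(h_true)) + eps  # eps guards against log of zero
    mag_pred = np.abs(np.asarray(h_pred)) + eps
    diff_db = 20.0 * np.log10(mag_true / mag_pred)
    per_filter = np.sqrt(np.mean(diff_db ** 2, axis=-1))
    return float(np.mean(per_filter))

# Illustrative magnitudes: 4 filters x 8 frequency bins.
reference = np.tile(np.linspace(0.5, 1.5, 8), (4, 1))
lsd_same = log_spectral_distortion(reference, reference)
lsd_scaled = log_spectral_distortion(10.0 * reference, reference)
```

Under this definition, a prediction that is off by a constant factor of 10 in magnitude at every bin gives an LSD of 20 dB.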

Speakers

Alexandre Philippon

Loïc Reboursière

Thierry Dutoit

IRCAM:Gallery 1, place Igor Stravinsky Paris 4e

HRTFs, Poster

1:30pm CEST

(P) Investigating the Effect of Sample Rate Variation on the Accuracy of Sound Source Localisation Using a Neural Network

Thursday July 2, 2026 1:30pm - 3:00pm CEST

IRCAM:Gallery

This paper describes an experiment investigating how reducing the sample rate of the audio training data affects the localisation performance of 'SampleDOA_SR', a neural network for Sound Source Localisation. Reducing the sample rate has several benefits, most notably a reduction in training time. The goal is to determine an appropriate sample rate which balances both localisation accuracy and training time. This information will be used to inform the future training of a neural network for Sound Source Localisation which will be used in a stereo upmixing pipeline. The results of this experiment indicate that reducing the sample rate from 48 kHz to below 4 kHz results in a significant decrease in localisation accuracy. However, above 4 kHz, the decrease in localisation accuracy is minimal whilst training time is reduced significantly. This suggests that, provided the particular application for the model does not require the highest level of accuracy, a minimal reduction in localisation performance may be acceptable to obtain a large reduction in training time, which would also reduce the environmental impact of the model training. A sample rate of 16 kHz is suggested as a suitable balance between accuracy and training time.
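The trade-off above can be made concrete with simple arithmetic: a signal sampled at rate fs can represent content only up to fs / 2 (the Nyquist frequency), and the number of samples per second of audio scales linearly with fs. The sketch below tabulates both for the three rates mentioned in the abstract. The function name is hypothetical, and training time need not scale exactly with sample count.

```python
REFERENCE_RATE_HZ = 48_000

def rate_summary(sample_rate_hz, reference_hz=REFERENCE_RATE_HZ):
    """Return (Nyquist frequency in Hz, fraction of samples kept vs. the reference)."""
    return sample_rate_hz / 2, sample_rate_hz / reference_hz

for rate in (48_000, 16_000, 4_000):
    nyquist, fraction = rate_summary(rate)
    print(f"{rate} Hz: content up to {nyquist:.0f} Hz, {fraction:.1%} of the samples")
```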

Speakers

Samuel Hobern

Alan Archer-Boyd

Damian Murphy

IRCAM:Gallery 1, place Igor Stravinsky Paris 4e

HRTFs, Poster

1:30pm CEST

(P) Optimising HRTFs to Improve Spatial Release from Masking

Thursday July 2, 2026 1:30pm - 3:00pm CEST

IRCAM:Gallery

Binaural hearing supports effective communication in complex acoustic environments by enabling listeners to segregate spatially separated sound sources, a benefit referred to as spatial release from masking (SRM). The spatial cues that give rise to SRM are determined by the head-related transfer function (HRTF). Although individual HRTFs are generally considered optimal for accurate localisation, prior work suggests they do not necessarily maximise performance across all aspects of spatial perception, including SRM. This motivates the concept of application-specific HRTFs. Here, we propose an application-specific HRTF augmentation method to improve speech intelligibility in cocktail-party scenarios, focusing on front–back configurations where SRM is limited. HRTFs are parameterised using principal component analysis and optimised via a differentiable auditory-model-based objective to enhance spectral cues while constraining interaural level differences. The method yields model-predicted SRM gains of 4–9 dB without inducing substantial predicted lateralisation artefacts.
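As background for the parameterisation step mentioned above, the sketch below shows a generic PCA of log-magnitude spectra via the singular value decomposition. The data are random placeholders, the array sizes and component count are assumptions, and the auditory-model-based optimisation described in the abstract is not shown.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical data: 200 directions x 64 frequency bins of log-magnitude (dB).
hrtf_db = rng.normal(size=(200, 64))

mean_db = hrtf_db.mean(axis=0)
centered = hrtf_db - mean_db
# Rows of vt are principal components, ordered by explained variance.
_, _, vt = np.linalg.svd(centered, full_matrices=False)

n_components = 10                           # assumed; not taken from the paper
basis = vt[:n_components]                   # shape (10, 64)
weights = centered @ basis.T                # shape (200, 10): low-dimensional parameters
reconstructed = mean_db + weights @ basis   # approximate spectra, shape (200, 64)
```

An optimiser would then adjust the entries of `weights` rather than every frequency bin directly, which keeps the search space small.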

Speakers

Nils Marggraf-Turley

Niels Pontoppidan

Lorenzo Picinali

IRCAM:Gallery 1, place Igor Stravinsky Paris 4e

HRTFs, Poster

1:30pm CEST

(P) Which Tracking Characteristics from an Audio Only VR Onboarding Predict the Best Performance HRTFs?

Thursday July 2, 2026 1:30pm - 3:00pm CEST

IRCAM:Gallery

This study aims to determine the ‘best’-matching HRTFs during an onboarding task for an audio-only virtual reality (VR) experience, using a ‘shooting down sound sources’ task. It is motivated by the needs of blind and visually impaired gamers, who may rely more heavily on accurate rendering of auditory spatial cues to succeed in an audio-only VR experience. We present an exploratory study using an experimental VR test platform that renders ‘target’ sound sources in a virtual environment and logs tracking characteristics of the head, hand-held controller and body while participants localise and ‘shoot’ audible ‘targets’ that are visible (for task familiarisation) and invisible. Four game-relevant sound stimuli and three different HRTFs were tested across eight sessions on two separate days. We report data collected from fifteen sighted participants, which demonstrate an ability to localise the sound sources accurately. The tracking data suggest various search patterns (e.g. hemisphere swaps and direction reversals) associated with ‘weak’ localisation cues and possible ambiguities. These search patterns are likely all quantifiable via angular error, response time, path length, search directions, number of reversals, and search speed, as determined from the tracking characteristics.
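Two of the measures listed above, path length and number of reversals, can be illustrated on a one-dimensional head-yaw trace. This is a simplified sketch: the function name and sample values are hypothetical, and the study's tracking data cover head, controller and body rather than yaw alone.

```python
import numpy as np

def search_metrics(yaw_deg):
    """Return (path length in degrees, number of direction reversals) for a yaw trace."""
    yaw = np.asarray(yaw_deg, dtype=float)
    # Wrap each step to [-180, 180) so crossing the 0/360 boundary is not a large jump.
    steps = (np.diff(yaw) + 180.0) % 360.0 - 180.0
    path_length = float(np.sum(np.abs(steps)))
    moving = np.sign(steps[steps != 0.0])  # ignore samples where the head is still
    reversals = int(np.sum(moving[1:] != moving[:-1]))
    return path_length, reversals

# Hypothetical yaw samples in degrees.
trace = [0, 10, 25, 25, 15, 5, 20, 40]
path_length, reversals = search_metrics(trace)
print(path_length, reversals)
```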

Speakers

Max Væhrens

Stefania Serafin

Flemming Christensen

Dorte Hammershøi

3:00pm CEST

Coffee

Thursday July 2, 2026 3:00pm - 3:30pm CEST

IRCAM:Gallery 1, place Igor Stravinsky Paris 4e

Social event