Schedule as of May 2026 - subject to change

Default Time Zone is EDT - Eastern Daylight Time


Type: Perception
Friday, July 3
 

11:00am CEST

(P) Perceptual Evaluation of Higher-Order Ambisonic Codecs on Both Synthetic Mixing and Native Recordings
Friday July 3, 2026 11:00am - 12:30pm CEST
Spatial audio is spreading in applications such as virtual and augmented reality and immersive games. The higher-order ambisonic (HOA) format is particularly useful in this context. Transmitting spatial information requires multiple channels, e.g., 16 channels for third-order ambisonics, which increases storage requirements and communication bitrates. Efficient compression algorithms are therefore necessary for such content. The recently standardized IVAS codec supports the coding of HOA content for communication use cases. Here, we evaluate it in comparison with a basic multi-mono approach across a variety of content and spatialization methods. Results show that IVAS outperforms the multi-mono approach at the same bitrate. In particular, the codec exploits inter-channel correlation to reduce the bitrate. It is therefore especially robust for signals with high inter-channel correlation, such as those composed of a limited number of plane waves. Conversely, the multi-mono approach cannot exploit this correlation and performs poorly on this type of signal.
IRCAM:ESPRO (HOA) 1, place Igor Stravinsky Paris 4e
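
The channel count quoted in the HOA codec abstract above (16 channels for third-order ambisonics) follows from the standard full-sphere relation of (N + 1)^2 channels for order N. The sketch below works through that arithmetic and the resulting total bitrate of a multi-mono scheme, in which each channel is coded independently. The function names and the per-channel bitrate of 24 kbps are illustrative assumptions, not values taken from the paper.

```python
def hoa_channel_count(order: int) -> int:
    """Channels needed for full-sphere ambisonics of a given order: (N + 1)^2."""
    if order < 0:
        raise ValueError("ambisonic order must be non-negative")
    return (order + 1) ** 2


def multi_mono_total_kbps(order: int, per_channel_kbps: float) -> float:
    """Total bitrate when every HOA channel is coded as an independent mono stream.

    Hypothetical helper: a multi-mono scheme cannot share bits between
    channels, so the total is the per-channel rate times the channel count.
    """
    return hoa_channel_count(order) * per_channel_kbps


if __name__ == "__main__":
    # 24 kbps per channel is an assumed, illustrative figure.
    for order in (1, 2, 3):
        channels = hoa_channel_count(order)
        total = multi_mono_total_kbps(order, 24.0)
        print(f"order {order}: {channels} channels, {total} kbps multi-mono")
```

Because the multi-mono total grows linearly with the channel count, a codec that exploits inter-channel correlation has more redundancy to remove as the order increases.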

1:30pm CEST

(P) Acoustic and Perceptual Evaluation of Integrated Near-Ear Speakers vs. Over Head Headphones in VR Environments
Friday July 3, 2026 1:30pm - 4:00pm CEST
Virtual reality (VR) technologies have become increasingly widespread, extending beyond their traditional military and professional training applications to areas such as education, simulation, gaming, and entertainment. Most modern VR headsets are equipped with built-in near-ear speakers, commonly called nearphones. Sitting between conventional headphones and loudspeakers, nearphones offer a convenient and lightweight audio solution without physically enclosing the ear. However, their impact on spatial audio perception and localization performance remains underexplored. This study explores how nearphones integrated into head-mounted displays (HMDs) perform relative to traditional headphones, focusing on identifying the specific acoustic and perceptual factors that enhance or hinder immersive audio experiences in virtual reality. Using the Oculus Quest 3 as a test platform, the research was divided into two parts. First, the frequency response of both headphones and nearphones was measured to assess differences in sound quality. Second, a VR first-person shooter game was developed in Unreal Engine to evaluate sound localization. Participants identified targets based on audio cues alone, and performance metrics such as target accuracy and reaction times were collected to compare localization effectiveness. Besides localization accuracy, this research explored which audio devices users prefer. The results suggest that while traditional headphones generally offer more accurate spatial localization, nearphones provide greater comfort and convenience, highlighting a trade-off between acoustic precision and user ergonomics in VR applications.
IRCAM:Gallery 1, place Igor Stravinsky Paris 4e

1:30pm CEST

(P) Increasing Accessibility of Auditory Research: A 6-DoF Motion-Capture-Based Interface for Localisation Testing
Friday July 3, 2026 1:30pm - 4:00pm CEST
Perceptual evaluation of auditory localisation typically relies on graphical user interfaces, pointing devices, or touch screens to capture listener responses. These modalities implicitly require functional vision and/or manual dexterity, excluding, for instance, people with visual impairments from participation. This paper presents a solution for absolute sound-source localisation testing that uses head rotation, tracked by a six-degrees-of-freedom (6-DoF) optical motion-capture system, as the response interface and relies solely on auditory cues for calibration and pointing. The paradigm builds on the natural coupling between auditory spatial attention and head orientation. Individual systematic bias is characterised via a mandatory training block in which stimuli are presented at discrete loudspeaker positions. A per-participant linear regression fitted to head-centred training responses provides a bias model that is applied to main-experiment trials, enabling decomposition of localisation error (LE) into constant error (CE, reflecting accuracy) and random error (RE, reflecting precision), following the established accuracy-precision framework for spatial hearing assessment. The specific use case simulates off-sweet-spot listening positions to inform development of a renderer aimed at enhancing the experience of visually impaired audiences consuming audio-described broadcast content. Preliminary data from the control group, consisting of sighted participants, are presented. The interface design, calibration procedure, and analysis pipeline are offered as a contribution towards inclusive spatial audio evaluation practice.
IRCAM:Gallery 1, place Igor Stravinsky Paris 4e

1:30pm CEST

(P) The impact of audio spatialisation reproduction on the neurophysiological responses of music listeners
Friday July 3, 2026 1:30pm - 4:00pm CEST
Research into listeners’ emotional experience of different audio formats relies heavily on subjective, self-report measures. However, little is known about neural and physiological responses. This feasibility study therefore used electroencephalography (EEG), heart rate (HR), and galvanic skin response (GSR) to explore the objective neurophysiological impacts of mono, stereo, and spatial audio formats across different music genres. In a within-subjects design, participants listened to 27 randomised stimuli, each comprising a 30-second music excerpt, across the three audio formats. Results were not statistically significant, but trends did emerge in the data. While mono formats appeared to elevate cognitive load and arousal, spatial audio elicited a decrease in physiological arousal, promoting a more relaxed state. However, the effects overall were highly genre-dependent. Differences in physiological response between static and dynamic spatial reproduction of different music genres are discussed. While limited by its sample size and the lack of subjective validation, this study highlights interesting relationships between audio format and the physiological responses of music listeners.
IRCAM:Gallery 1, place Igor Stravinsky Paris 4e

1:30pm CEST

(P) Virtualising SPHERE: active listening in 3D sound localisation
Friday July 3, 2026 1:30pm - 4:00pm CEST
Spatial hearing emerges from the integration of auditory, multisensory, and motor information, and is enhanced in natural conditions through active listening, where head and body movements provide dynamic cues that improve localisation accuracy and perceptual stability. This principle is central to immersive audio research in Virtual and Augmented Reality (VR/AR), where binaural rendering based on Head-Related Transfer Functions (HRTFs) and room acoustic cues enables the reproduction of interaural, monaural, and distance information. Beyond acoustics, bodily engagement (i.e., reaching toward sound sources) further supports spatial adaptation. These technologies enable controlled experimental protocols for assessment and training in both normal-hearing and hearing-impaired populations. One such paradigm is SPHERE, originally developed to study three-dimensional sound localisation in ecologically valid conditions and later applied to training and rehabilitation, including for cochlear implant users. In its original implementation, participants localise sounds presented via a physically moved loudspeaker and respond either through active exploration or under static listening constraints, while head, eye, and hand movements are tracked to analyse localisation accuracy, motor behaviour, and search strategies. However, reliance on a human operator limits reproducibility and scalability. This work introduces a fully virtualised SPHERE implementation using an immersive binaural rendering framework, preserving the original spatial configuration while enabling real-time multimodal tracking. The system also evaluates the impact of HRTF individualisation by comparing generic and personalised filters. Performance is validated against the original loudspeaker-based paradigm to assess ecological validity. Preliminary results regarding the system’s effectiveness as a research and clinical tool will be presented at the conference.
IRCAM:Gallery 1, place Igor Stravinsky Paris 4e
 