Eclipsa Audio, based on the Immersive Audio Model and Format (IAMF) specification developed by members of the Alliance for Open Media, represents an open and royalty-free approach to immersive audio creation and delivery. Eclipsa Audio provides a growing ecosystem for producing and distributing spatial audio content, with hardware integration and streaming platform support, including YouTube, actively being rolled out. This panel brings together practitioners, researchers, and engineers directly involved in the development of IAMF and Eclipsa Audio to inform the audio engineering community about the current state of the format and its evolving toolkit for immersive audio production and delivery. The presenters will discuss how the Eclipsa Audio ecosystem can continue growing in the live and interactive realms, including 360 videos, streaming, gaming and the combination of both, e.g., in e-sports. The discussion of future directions will also include developers' perspectives on how Eclipsa Audio can be embraced by interactive environments.
Procedural audio, sometimes known as digital Foley, is the real-time and controllable generation of sound effects. It is an alternative to sourcing sound effects from vast libraries of pre-recorded samples. It may be used to have sounds adapt to the changing game state, and to dynamically generate all the sounds of a virtual world. However, there are challenges concerning the diversity of sounds that may be generated, the controllability of procedural audio models and the quality of the sounds that it produces. We address all of these aspects in this presentation. We showcase the opportunities that procedural audio offers and how the challenges can be surmounted, while providing demonstrations of these concepts. The session opens with an introduction to the presenters before moving into a broad review of procedural audio and its history in game sound design, covering core concepts, prior uses, and how the technology has developed over time. A video presentation accompanies this overview before the workshop turns to an honest examination of the key challenges facing the field: the diversity of sounds that can be generated, the controllability of procedural models, and the quality of their output. Recent advances tackling these limitations are then discussed, followed by live demonstrations of state-of-the-art procedural audio systems from Nemisindo, which generate dynamic, immersive soundscapes in real time. The session closes with an open questions and answers segment. Attendees will leave with practical insight into how procedural audio can enhance and expand the creative process for game sound designers, a clearer understanding of how to implement dynamic and adaptive sound in their own projects, hands-on exposure to interactive soundscape techniques, and concrete tips and tricks for improving their game audio practice.
This session is suitable for sound designers, game developers, and anyone curious about the future of game audio. No prior knowledge of procedural audio is needed.
This workshop introduces the Binaural Rendering Toolbox (BRT), a set of open-source (GPLv3) software libraries, applications, and definitions intended as a virtual laboratory for spatial psychoacoustic experimentation. The BRT provides a flexible and modular framework for binaural spatialisation, supporting multiple rendering models, including convolution-based and geometric approaches, as well as advanced features such as source directivity, several room acoustics models, individual HRTFs, BRIRs, near-field simulation, and real-time control via OSC.
This work investigates how audio accessibility in games can be reconceptualized as a structured and immersive auditory interaction paradigm, rather than a collection of discrete assistive cues. While existing approaches have improved access to gameplay information for visually impaired players, they remain largely event-driven and fragmented, often presenting auditory signals as isolated notifications. Such approaches may limit perceptual continuity and fail to reflect the dynamic, layered nature of interactive environments. The proposed system introduces an auditory information architecture that organizes gameplay information into four continuous layers: navigation, interaction, salience, and environment. Each layer represents a distinct yet interrelated perceptual function. Navigation encodes spatial orientation and movement, interaction conveys player actions and system responses, environment reflects ambient and contextual information, and salience integrates perceptually relevant events—including hazards, state transitions, and attention-driven signals—into a unified and context-sensitive layer. By structuring auditory output across these layers, gameplay information is represented as continuously evolving auditory processes rather than discrete cues. The system is implemented using Unreal Engine and Audiokinetic Wwise, with Max/MSP and RNBO used to extend real-time audio processing. Rather than prioritizing novelty through fully generative synthesis, the approach focuses on transforming and reorganizing existing sound materials through continuous parameter mapping. This enables adaptive auditory behavior while maintaining perceptual clarity and consistency with the game’s sonic identity. Interaction design is extended through a multi-source input framework. 
A camera-based input layer combines webcam-based motion capture with analysis of screen-mediated interactions, including controller inputs (e.g., mouse, gamepad, touch) and player-character movement within the game environment. These inputs are translated into perceptual features and mapped to auditory parameters, forming a bidirectional interaction loop in which player behavior directly influences auditory output. A user study is planned to evaluate the effectiveness of the proposed system in non-visual navigation tasks. The study will compare the layered auditory architecture with conventional cue-based approaches. Evaluation metrics will include objective measures such as navigation accuracy and task completion time, as well as subjective measures including perceived spatial awareness, perceptual continuity, and immersion. In addition, the study will examine the impact of camera-based interaction on engagement and perceived agency. The evaluation is designed to investigate whether continuous auditory representation improves coherence between auditory feedback and gameplay experience. A central contribution of this work lies in the aesthetic integration of accessibility. Rather than functioning as an external assistive layer, accessibility-oriented audio is embedded within the core sound design. Informational signals emerge through transformations of existing sound materials, allowing perceptual clarity to be achieved without disrupting immersion. This reframes audio accessibility as an integral component of auditory interaction design. From a practical perspective, the system is structured as a modular and parameter-driven framework, allowing scalable implementation across different platforms. Potential constraints related to computational load—particularly in real-time processing and camera-based input—are considered, with an emphasis on efficient parameter mapping and system optimization for resource-limited environments such as mobile and virtual reality.
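The idea of continuous parameter mapping can be sketched in a few lines. The example below is an assumption-laden illustration rather than the authors' implementation, which runs in Unreal Engine, Wwise and RNBO. It maps a normalised feature (here a hypothetical "proximity" value) onto a filter cutoff along an exponential curve, then smooths the result so the sound glides rather than steps. All names and ranges are invented for the example.

```python
import math


def map_exponential(value: float, out_min: float, out_max: float) -> float:
    """Map a normalised value in [0, 1] onto [out_min, out_max] along an exponential curve.

    Exponential curves suit frequency-like parameters, where equal ratios
    are perceived as roughly equal steps.
    """
    clamped = min(1.0, max(0.0, value))
    return out_min * (out_max / out_min) ** clamped


class SmoothedParameter:
    """One-pole smoother: the value glides toward its target instead of jumping."""

    def __init__(self, initial: float, time_constant_s: float, update_rate_hz: float):
        self.value = initial
        self.coeff = math.exp(-1.0 / (time_constant_s * update_rate_hz))

    def update(self, target: float) -> float:
        self.value = target + self.coeff * (self.value - target)
        return self.value


# Hypothetical mapping: proximity to a point of interest opens a low-pass filter.
cutoff = SmoothedParameter(initial=200.0, time_constant_s=0.25, update_rate_hz=60.0)
for proximity in (0.0, 0.2, 0.6, 1.0):
    target_hz = map_exponential(proximity, 200.0, 8000.0)
    current_hz = cutoff.update(target_hz)
```

In an actual game the smoothed value would be forwarded to the audio middleware on every frame. The sketch stops at computing it.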
Sound engineers doing live mixing for theatre must manage the balance between on-stage acoustic sources and electroacoustic sounds diffused in the hall, triggering, spatialising and mixing pre-recorded and live sounds while actors perform. Typically confined to the control booth of unfamiliar venues, they need to adapt to a listening perspective that differs significantly from the audience's experience in the stalls or the balconies. This work engages with sound studies, virtual acoustics, and archival practices to investigate complementary questions. How does the acoustic dissociation between the control booth and the rest of the venue influence the technical and aesthetic decisions of sound engineers? How would contemporary engineers interpret archived theatrical soundtracks when guided by annotated scripts? To address these questions, the research unfolds in four stages: capture multiple Higher-Order Ambisonics Impulse Responses from emblematic theatres in São Paulo, Brazil, combining flexible source setups and multiple listening positions; use them to build a real-time convolution engine, integrating the IRs with actors' voices and archival soundtracks from the collection of Brazilian theatrical sound designer Tunica Teixeira; invite sound engineers to perform mixing tasks in a virtual acoustic environment, guided by Tunica's annotated scripts; use the task metrics and structured questionnaires to assess the impact of multi-perspective listening on their technical and aesthetic decisions.
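The principle behind a real-time convolution engine can be shown with a minimal block-wise overlap-add sketch. It is a single-channel simplification with invented names, not the engine described in the study. It uses direct time-domain convolution per block for readability, whereas a practical engine would typically use FFT-based partitioned convolution and would handle every channel of a Higher-Order Ambisonics impulse response.

```python
def convolve(signal, impulse_response):
    """Direct (time-domain) linear convolution of two sample sequences."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


class BlockConvolver:
    """Convolve a stream block by block, carrying the reverberant tail between blocks."""

    def __init__(self, impulse_response):
        if len(impulse_response) == 0:
            raise ValueError("impulse response must not be empty")
        self.ir = list(impulse_response)
        self.tail = [0.0] * (len(self.ir) - 1)  # output still owed from earlier blocks

    def process(self, block):
        """Return len(block) output samples for the given input block."""
        full = convolve(block, self.ir)  # len(block) + len(ir) - 1 samples
        for i, carried in enumerate(self.tail):
            full[i] += carried  # overlap-add the tail left by previous blocks
        n = len(block)
        self.tail = full[n:]  # keep what extends past this block
        return full[:n]


# Streaming a short signal in blocks of two samples.
engine = BlockConvolver([0.5, 0.25, 0.125])
stream = [1.0, 0.5, -0.25, 0.0, 0.75, -1.0]
rendered = []
for start in range(0, len(stream), 2):
    rendered.extend(engine.process(stream[start:start + 2]))
```

The block-wise result equals the first `len(stream)` samples of a one-shot convolution of the whole signal. The remaining tail stays in `engine.tail` until further blocks, or silence, are processed.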
Navigation tasks are often used as a fun and engaging method of exploring and interacting with video game environments. 3D open-world games afford curiosity-driven navigation for players, providing opportunities to follow their agency and interact with points of interest within an environment. However, this is commonly a visually motivated task that is seldom accessible to Blind and Low Vision (BLV) gamers. Given the impact of this barrier, it is imperative to design navigation systems for games that are driven by auditory information to provide equal opportunities for BLV gamers to engage with open-world game environments. A noted lack of understanding among game developers evidences the need for dialogue between researchers, BLV gamers and developers. In collaboration with both BLV gamers and developers, we present the first, early co-designed prototypes for a customisable, Blind-accessible auditory navigation toolkit in 3D open-world video game environments. We build on a series of dialogic discussions with Disabled gamers who have experienced barriers in their gameplay experiences and present three navigation tools. We document the design of these tools and present themes explored through analysis of the co-design phases. We discuss themes including player agency, action precision, gameplay fluidity and cognitive load, categorisation and identification, sound preference, and tutorialisation and learnability. From these themes, we derive design insights that highlight the barriers and considerations for auditory navigation in video games.
Pleyel.exe is an interactive documentary presented as a video game, exploring the evolving landscape of the Carrefour Pleyel district in Saint-Denis. Through free navigation within immersive 3D scans generated from Gaussian splatting, visitors can wander through sites in transition. As they explore, they encounter residents’ testimonies, drawn from in-situ recorded and carefully edited interviews, offering personal perspectives on the neighborhood and its ongoing transformations.
Using pitch, delay and modulation effects to perceptually spread a mono source across additional, adjacent playback channels is a staple of music production, born in stereo and extended to immersive. This tutorial begins with the historical origins of the effect in the mid-1970s, showing the signal processing chains used, measuring its impact on the signal, discussing its psychoacoustic merit, and demonstrating the resulting sound. The evolution from stereo through surround sound to immersive formats using contemporary production tools and techniques is demonstrated. The effect is still evolving as new tools are developed and creators explore what is possible across all immersive artforms – music, AR/VR, and games. It is hoped that a deep dive into the first 50 years of this effect will inform and inspire immersive mixers for the future.
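One ingredient of such an effect can be sketched as a slowly modulated short delay applied to a copy of the mono signal, which is then routed to a second channel. The delay time, modulation depth and rate below are illustrative assumptions, not values taken from the historical signal chains covered in the tutorial. Because the delay varies over time, the copy also carries a slight pitch fluctuation.

```python
import math


def widen_mono(samples, sample_rate, base_delay_ms=15.0, depth_ms=2.0, rate_hz=0.5):
    """Return (left, right): the dry signal and a copy passed through a modulated delay.

    The delay is read with linear interpolation so it can fall between samples.
    All default values are illustrative assumptions.
    """
    left = list(samples)
    right = []
    for n in range(len(samples)):
        lfo = math.sin(2.0 * math.pi * rate_hz * n / sample_rate)
        delay_ms = base_delay_ms + depth_ms * lfo
        position = n - delay_ms * sample_rate / 1000.0  # fractional read index
        index = math.floor(position)
        frac = position - index
        # Samples outside the buffer are treated as silence.
        a = samples[index] if 0 <= index < len(samples) else 0.0
        b = samples[index + 1] if 0 <= index + 1 < len(samples) else 0.0
        right.append((1.0 - frac) * a + frac * b)
    return left, right


# One second of a 220 Hz sine at 48 kHz, spread across two channels.
rate = 48000
mono = [math.sin(2.0 * math.pi * 220.0 * n / rate) for n in range(rate)]
left, right = widen_mono(mono, rate)
```

Extending the idea beyond stereo would mean producing further differently delayed copies for additional channels, which this sketch does not attempt.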
Immersive audio production for XR, virtual environments, and live performance is increasingly defined by a diversity of spatial formats and rendering systems, including Ambisonics and object-based approaches. While these enable complex spatial experiences, they also result in fragmented workflows and limited interoperability across production and playback contexts. This workshop explores spatial audio as a flexible and transferable practice rather than a system-bound process. It introduces Grapes 3D Audio Control as a system-independent control approach that enables users to work with spatial audio across different environments without being tied to a specific rendering pipeline. Participants will engage in hands-on exercises to create, control, and adapt spatial audio scenes across multiple contexts, including XR applications, media installations, and live setups. The focus lies on maintaining spatial intent while working across heterogeneous systems and technical conditions. The workshop combines practical exploration with short demonstrations and structured discussion. It explicitly creates space for exchange on different workflows and production strategies, bringing together perspectives from sound design, audio engineering, live operation, and XR development. By focusing on interoperability, workflow design, and real-world application, the workshop aims to provide participants with practical strategies for working with spatial audio across systems, while contributing to a broader discussion on how immersive audio production can become more flexible, portable, and sustainable.
This is a demonstration of a newly developed network spatial audio engine and its client software, showing how an object-based audio performance can be presented simultaneously locally and in a virtual venue. I’ll play multiple tracks of audio and position data from a laptop, as a surrogate for a local performance, and stream this object-based audio into a 6 DoF virtual audio space. I’ll then show how a remote audience can join the same audio space via web browsers, listen to the music and explore the space. Finally, I’ll show how a resolved Ambisonic mix of the original performance and the sounds of the remote audience is streamed back into the auditorium and played out. I’ll walk through the process and show how all audio and data transfer is done with simple data structures and standard, non-proprietary streaming protocols.
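To make the notion of resolving object-based audio into an Ambisonic mix concrete, the sketch below encodes mono objects with static positions into first-order Ambisonics and sums them. It assumes the AmbiX convention (channel order W, Y, Z, X with SN3D normalisation), because the abstract does not state which order or convention the engine uses. The `AudioObject` structure is hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class AudioObject:
    """Hypothetical audio object: mono samples plus a static direction."""
    samples: list
    azimuth_deg: float    # counter-clockwise from straight ahead
    elevation_deg: float  # upward from the horizontal plane


def encode_first_order(obj: AudioObject):
    """Encode one object to first-order Ambisonics (AmbiX: W, Y, Z, X; SN3D)."""
    az = math.radians(obj.azimuth_deg)
    el = math.radians(obj.elevation_deg)
    gains = (
        1.0,                          # W: omnidirectional
        math.sin(az) * math.cos(el),  # Y: left-right
        math.sin(el),                 # Z: up-down
        math.cos(az) * math.cos(el),  # X: front-back
    )
    return [[gain * sample for sample in obj.samples] for gain in gains]


def mix_objects(objects):
    """Sum the encoded objects into one four-channel Ambisonic mix."""
    length = max((len(obj.samples) for obj in objects), default=0)
    mix = [[0.0] * length for _ in range(4)]
    for obj in objects:
        for channel, encoded in enumerate(encode_first_order(obj)):
            for n, sample in enumerate(encoded):
                mix[channel][n] += sample
    return mix


# Two hypothetical objects: one hard left, one in front and raised.
scene = [
    AudioObject(samples=[0.5, 0.25], azimuth_deg=90.0, elevation_deg=0.0),
    AudioObject(samples=[1.0], azimuth_deg=0.0, elevation_deg=30.0),
]
w, y, z, x = mix_objects(scene)
```

A 6 DoF renderer would additionally recompute each direction, and apply distance attenuation, relative to the moving listener. That step is left out here.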
Modern auditory rehabilitation faces significant challenges in speech discrimination within complex, noisy acoustic environments. The use of Augmented Reality interfaces based on "virtual sound objects" proposes the separation and selective enhancement of audio sources, while the Auracast standard (Bluetooth LE Audio) emerges as the ideal mechanism to distribute these independent streams with low latency. However, the advancement of such selective listening strategies is strictly limited by proprietary commercial ecosystems and a complete lack of open-source research platforms that adhere to the physical and power constraints of wearable devices. To bridge this gap, this work develops an open-source Auracast application on the Tiresias open-hardware platform, establishing an accessible "front-end" infrastructure for auditory interaction. The architecture was implemented on the Nordic nRF5340 SoC utilizing the Zephyr RTOS. Preliminary evaluations on a development kit successfully validated the protocol stack integration, demonstrating stream stability. Ongoing work focuses on porting the firmware to the Tiresias board and integrating the ADAU1787 audio codec, aiming to empirically quantify the end-to-end latency and energy efficiency of the embedded system.
While Virtual Reality offers transformative potential for immersive storytelling, the heavy reliance on visual stimuli often excludes Blind and Visually Impaired audiences. Conventional accessibility methods, such as linear Audio Description, frequently struggle to keep pace with the non-linear, explorative nature of virtual environments, resulting in an "accessibility chasm" where traditional two-dimensional solutions fail to support non-visual navigation. This research addresses these limitations through a User-Centred Design approach, centred on the thematic analysis of semi-structured focus groups involving twelve experienced Blind and Visually Impaired videogame players from the Royal National Institute of Blind People. The inquiry explored four themes: spatial sound navigation, audio description integration, haptic efficacy, and the social dimensions of virtual interfaces. Findings indicate that non-visual spatial exploration requires a multifaceted auditory system utilizing 3D sound, predictable sound effects, and abstract sound signifiers, paired with a hybrid audio description model balancing functional and affective narration. To mitigate the risk of cognitive overload, participants identified haptic feedback as a critical tool for tactile confirmation and attentional guidance, serving as a non-auditory anchor that complements the primary soundscape. These user-led insights, together with real-life examples from accessible video games, inform the development of ‘Description Spheres’: interactive virtual objects embedded within virtual environments that serve as multi-sensory hubs. By integrating spatialized audio, localized haptics, and experimental audio description, the system enables a transition to a dynamic, exploratory model that translates complex visual-spatial data into intuitive, non-visual sensory ecosystems, offering a scalable blueprint for inclusive design.