Venue: Hall F
Friday, May 23
 

9:00am CEST

How to create and use audio for accessible video games?
Friday May 23, 2025 9:00am - 10:00am CEST
Sound is one of the most powerful tools for accessibility in video games, enabling players with visual impairments or cognitive disabilities to navigate, interact, and fully engage with the game world. This panel will explore how sound engineers can leverage audio design to enhance accessibility, making games more inclusive without compromising artistic intent. Experts from different areas of game development will discuss practical approaches, tools, and case studies that showcase how audio can bridge gaps in accessibility.

Discussion Topics:

• Why is sound crucial for accessibility in video games? Audio cues, spatial sound, and adaptive music can replace or complement visual elements, guiding players with disabilities through complex environments and interactions.
• Designing effective spatial audio for navigation and interaction. Using 3D audio and binaural rendering to provide players with intuitive sound-based navigation, enhancing orientation and gameplay flow for blind or visually impaired users.
• Audio feedback and sonification as key accessibility tools. Implementing detailed auditory feedback for in-game actions, menu navigation, and contextual cues to improve usability and player experience.
• Case studies of games with exemplary accessible audio design. Examining how games like The Last of Us Part II, BROK: The InvestiGator, and other titles have successfully integrated sound-based accessibility features.
• Tools and middleware solutions for accessible sound design (example: InclusivityForge). Showcasing how game engines and plugins such as InclusivityForge can streamline the implementation of accessibility-focused audio solutions.
• Challenges in designing accessible game audio and overcoming them. Addressing common technical and creative challenges when designing inclusive audio experiences, including balancing accessibility with immersive design.
• Future trends in accessibility-driven audio design. Exploring how AI, procedural sound, and new hardware technologies can push the boundaries of accessibility in interactive audio environments.

Panel Guests:

• Dr Joanna Pigulak - accessibility expert in games, researcher specializing in game audio accessibility, assistant professor at the Institute of Film, Media, and Audiovisual Arts at UAM.
• Tomasz Tworek - accessibility consultant, blind gamer, and audio design collaborator specializing in improving audio cues and sonification in video games.
• Dr Tomasz Żernicki - sound engineer, creator of accessibility-focused audio technologies for games, and founder of InclusivityForge.

Target Audience:

• Sound engineers and game audio designers looking to implement accessibility features in their projects.
• Game developers interested in leveraging audio as a tool for accessibility.
• UX designers and researchers focusing on sound-based interaction in gaming.
• Middleware and tool developers aiming to create better solutions for accessible audio design.
• Industry professionals seeking to align with accessibility regulations and best practices.

This panel discussion will explore how sound engineers can enhance game accessibility through innovative audio solutions, providing insights into the latest tools, design techniques, and industry best practices.
Speakers
Tomasz Żernicki
co-founder, my3DAudio
Tomasz Zernicki is co-founder and former CEO of Zylia (www.zylia.co), an innovative company that provides tools for 3D audio recording and music production. Additionally, he is a founder of my3DAudio Ventures, whose goal is to scale audio companies that reach the MVP phase and want...
Friday May 23, 2025 9:00am - 10:00am CEST
Hall F ATM Studio Warsaw, Poland

9:30am CEST

Education & Career Fair
Friday May 23, 2025 9:30am - 11:30am CEST
Hall F ATM Studio Warsaw, Poland

10:15am CEST

Fast facts on room acoustics
Friday May 23, 2025 10:15am - 11:45am CEST
If you are considering establishing a room for sound, i.e., recording, mixing, editing, listening, or even a room for live music, this is the crash course to attend!
Initially, we’ll walk through the essential considerations for any design of an acoustic space, (almost) no matter the purpose: Appropriate reverberation time, appropriate sound distribution, low background noise, no echoes/flutter echoes, appropriate control of early reflections, (and for stereo/surround/immersive: a degree of room symmetry).
To prevent misunderstandings, we must define the difference between room acoustics and building acoustics. This is a tutorial on room acoustics! Finding the right reverberation time for a project depends on the room's purpose. We’ll look into some relevant standards to find an appropriate target value and pay attention to the importance of the room's frequency balance, especially at low frequencies! We will take Sabine’s equation as the starting point for calculation and discuss the conditions required to make it work.
The room's shape, the shape’s effect on room modes, and the distribution of the modes are mentioned (together with the term Schroeder Frequency). The acoustical properties of some conventional building materials and the consequences of choosing one in favor of another for the basic design are discussed. The membrane absorbers (plasterboard, plywood, gypsum board) and their importance in proper room design are presented here. This also involves the definition of absorption coefficients (and how to get them).
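For reference, the two formulas this part of the tutorial leans on are Sabine's equation and the Schroeder frequency:

```latex
% Sabine: reverberation time from room volume V (m^3) and total
% absorption A = sum_i(alpha_i * S_i) (m^2 Sabine)
T_{60} = \frac{0.161\,V}{A}

% Schroeder frequency: above f_S the modal density is high enough to
% treat the room statistically; below it, individual modes dominate
f_S \approx 2000 \sqrt{\frac{T_{60}}{V}}
```

As a quick worked example, a 100 m^3 room with T60 = 0.4 s gives f_S ≈ 2000·√(0.4/100) ≈ 126 Hz, which is why the low-frequency balance deserves the special attention called out above.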
From the “raw” room and its properties, we move on to defining the acoustic treatment needed to reach the target value. Again, the treatment can often be achieved with cheaper building materials, though plenty of expensive specialized materials are also available. We’ll try to find a way through the jungle, keeping an eye on the spending. The typical tools are porous absorbers for smaller rooms; sometimes resonance absorbers are used for larger rooms. We don’t want to over-absorb the high frequencies!
The placement of the sound sources in the room influences the perceived sound, and a few basic rules are given. Elements to control the sound field are discussed: absorption vs. diffusion. Some simple principles for DIY diffusers are shown.
During the presentation, various practical solutions are presented. At the end of the tutorial, there will be some time for a minor Q&A.
Speakers
Eddy B. Brixen
consultant, EBB-consult
Eddy B. Brixen received his education in electronic engineering from the Danish Broadcasting Corporation, the Copenhagen Engineering College, and the Technical University of Denmark. Major activities include room acoustics, electro-acoustic design, and audio forensics. He is a consultant...
Friday May 23, 2025 10:15am - 11:45am CEST
Hall F ATM Studio Warsaw, Poland

12:00pm CEST

Acoustic analysis of ancient stadia: from the Circus Maximus of Rome to the Hippodrome of Constantinople
Friday May 23, 2025 12:00pm - 1:30pm CEST
Speakers
Antonella Bevilacqua
University of Parma
Friday May 23, 2025 12:00pm - 1:30pm CEST
Hall F ATM Studio Warsaw, Poland

12:00pm CEST

Adaptive Room Acoustics Optimisation Using Virtual Microphone Techniques
Friday May 23, 2025 12:00pm - 1:30pm CEST
Room acoustics optimisation in live sound environments using signal processing techniques has captivated the minds of audio enthusiasts and researchers alike for over half a century. From analogue filters in the 1950s, to modern research efforts such as room impulse response equalisation and adaptive sound field control, this subject has exploded to life. Controlling the sound field in a static acoustic space is complex due to the high number of system variables, such as reflections, speaker crosstalk, equipment-induced coloration, room modes, reverberation, diffraction and listener positioning. These challenges are further amplified by dynamic variables such as audience presence, environmental conditions and room occupancy changes, which continuously and unpredictably reshape the sound field.
A primary objective of live sound reinforcement is to deliver uniform sound quality across the audience area. This is most critical at audience ear level, where tonal balance, clarity, and spatial imaging are most affected by variations in the sound field. While placing microphones at audience ear level positions could enable real-time monitoring, large-scale deployment is impractical due to audience interference.
This research will explore the feasibility of an adaptive virtual microphone-based approach to room acoustics optimisation. By strategically placing microphone arrays and leveraging virtual microphone technology, the system estimates the sound field dynamically at audience ear level without requiring physical microphones. By continuously repositioning focal points across listening zones, a small number of arrays could effectively monitor large audience areas. If accurate estimations can be achieved, real-time sound field control becomes more manageable and effective.
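The abstract does not commit to a particular estimation algorithm, but the basic virtual-microphone idea can be illustrated with a frequency-domain delay-and-sum beamformer focused at a chosen listening point; the geometry, names, and toy signals below are illustrative, not the system under study.

```python
import numpy as np

C = 343.0  # speed of sound, m/s

def virtual_mic(signals, mic_pos, focus, fs):
    """Estimate the pressure signal at `focus` from an array recording.

    signals : (M, N) array of M microphone signals
    mic_pos : (M, 3) microphone positions in metres
    focus   : (3,) virtual microphone position
    Simple delay-and-sum: advance each channel by its relative
    propagation delay from the focal point, then average.
    """
    M, N = signals.shape
    dists = np.linalg.norm(mic_pos - focus, axis=1)      # (M,)
    delays = (dists - dists.min()) / C                   # relative delays, s
    freqs = np.fft.rfftfreq(N, d=1.0 / fs)               # (F,)
    spectra = np.fft.rfft(signals, axis=1)               # (M, F)
    # phase shift that time-aligns each channel to the focal point
    align = np.exp(2j * np.pi * freqs[None, :] * delays[:, None])
    return np.fft.irfft((spectra * align).mean(axis=0), n=N)

# toy usage: 8 microphones around a focal point at ear height
rng = np.random.default_rng(0)
mics = rng.uniform(-2, 2, size=(8, 3))
sig = rng.standard_normal((8, 48000))
est = virtual_mic(sig, mics, np.array([0.0, 0.0, 1.2]), fs=48000)
```

Re-running the routine with a new `focus` at block rate is, in this toy picture, what continuously repositioning focal points across listening zones amounts to.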
Speakers
Gavin Kearney
Professor of Audio Engineering, University of York
Gavin Kearney graduated from Dublin Institute of Technology in 2002 with an Honors degree in Electronic Engineering and has since obtained MSc and PhD degrees in Audio Signal Processing from Trinity College Dublin. He joined the University of York as Lecturer in Sound Design in January...
Friday May 23, 2025 12:00pm - 1:30pm CEST
Hall F ATM Studio Warsaw, Poland

12:00pm CEST

Analysis of the Sound Pressure Level Distribution in the Low-Frequency Range Below the First Modal Frequency in Small Room Acoustics
Friday May 23, 2025 12:00pm - 1:30pm CEST
The occurrence of eigenmodes is one of the fundamental phenomena in the acoustics of small rooms. The formation of modes results in an uneven distribution of the sound pressure level in the room. To determine the resonance frequencies and their distributions, numerical methods, analytical methods or experimental studies are used. For the purposes of this paper, an experimental study was carried out in a small room. The study analysed measured sound pressure level distributions in the room, with a special focus on the 20 Hz - 32 Hz frequency range, below the first modal frequency of the room. The measurements were conducted on a rectangular 9x9 microphone grid, giving a grid resolution of 0.5 m. The influence of evanescent modes on the total sound field was investigated, and several sound source locations were taken into account. On the basis of the acoustic measurements, frequency response curves were also plotted. This paper presents several methods for analysing these curves, based on the standard deviation, a linear least-squares fit, the coefficient of determination R^2, and the root mean squared error (RMSE). The results made it possible to determine the best position for the acoustic source in the room under study, and the effect of evanescent modes on the total sound field was directly observed.
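As a sketch of how such curve-flatness statistics can be computed (the paper's exact procedure may differ), given a measured magnitude response in dB:

```python
import numpy as np

def response_flatness_metrics(freqs, mag_db):
    """Flatness metrics for a frequency response curve (magnitudes in dB).

    Returns the standard deviation, the slope of a linear least-squares
    fit, the coefficient of determination R^2, and the RMSE of the
    curve around that fit.
    """
    std = np.std(mag_db)
    # linear least-squares fit: mag_db ~ a * f + b
    a, b = np.polyfit(freqs, mag_db, deg=1)
    resid = mag_db - (a * freqs + b)
    ss_res = np.sum(resid**2)
    ss_tot = np.sum((mag_db - mag_db.mean())**2)
    r2 = 1.0 - ss_res / ss_tot
    rmse = np.sqrt(np.mean(resid**2))
    return {"std_dB": std, "slope": a, "R2": r2, "RMSE_dB": rmse}

# usage: compare candidate source positions, prefer the flattest curve
freqs = np.linspace(20, 32, 25)                 # Hz, range studied here
mag = -3 + 0.1 * freqs + np.random.randn(25)    # toy measured response
print(response_flatness_metrics(freqs, mag))
```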
Friday May 23, 2025 12:00pm - 1:30pm CEST
Hall F ATM Studio Warsaw, Poland

12:00pm CEST

Diffuse Signal Processing (DiSP) as a Method of Decorrelating a Stereo Mix to Increase Mono Compatibility
Friday May 23, 2025 12:00pm - 1:30pm CEST
Mono compatibility is a fundamental challenge in audio production, ensuring that stereo mixes retain clarity, balance, and spectral integrity when summed to mono. Traditional stereo widening techniques often introduce phase shifts, comb filtering, and excessive decorrelation, causing perceptual loss of critical mix elements in mono playback. Diffuse Signal Processing (DiSP) is introduced as a convolution-based method that improves mono compatibility while maintaining stereo width.

This study investigates the application of DiSP to the left and right channels of a stereo mix, leveraging MATLAB-synthesized TDI responses to introduce spectrally balanced, non-destructive acoustic energy diffusion. TDI convolution is then applied to both the left and right channels of the final stereo mix.

A dataset of stereo mixes from four genres (electronic, heavy metal, orchestral, and pop/rock) was analyzed. The study evaluated phase correlation, mono-summed frequency response deviation, and the amount of comb filtering to quantify improvements in mono summation. Spectral plots and wavelet transforms provided objective analysis. Results demonstrated that DiSP reduced phase cancellation, significantly decreased comb filtering artifacts, and improved spectral coherence in mono playback while preserving the stereo width of the original mix. Applying this process to the final left and right channels allows an engineer to mix freely without worrying about the mono mix’s compatibility.

DiSP’s convolution-based approach offers a scalable, adaptive solution for modern mixing and mastering workflows, overcoming the limitations of traditional stereo processing. Future research includes machine learning-driven adaptive DiSP, frequency-dependent processing enhancements, and expansion to spatial audio formats (5.1, 7.1, Dolby Atmos) to optimize mono downmixing. The findings confirm DiSP as a robust and perceptually transparent method for improving mono compatibility without compromising stereo imaging.
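The study's TDI responses are its own; the overall shape of the approach, convolving the left and right channels with short diffuse responses and checking the inter-channel correlation of the result, can be sketched as follows. Unit-energy decaying noise bursts stand in for the MATLAB-synthesized TDI responses, so this illustrates the idea rather than DiSP itself.

```python
import numpy as np
from scipy.signal import fftconvolve

def diffuse_ir(length, fs, t60=0.05, seed=0):
    """Short decaying noise burst: a stand-in for a TDI response."""
    rng = np.random.default_rng(seed)
    t = np.arange(length) / fs
    ir = rng.standard_normal(length) * np.exp(-6.91 * t / t60)
    return ir / np.sqrt(np.sum(ir**2))          # normalize to unit energy

def phase_correlation(left, right):
    """Correlation coefficient in [-1, 1]; +1 means fully mono-compatible."""
    return np.corrcoef(left, right)[0, 1]

fs = 48000
rng = np.random.default_rng(1)
src = rng.standard_normal(fs)                    # toy mix element
left = src
right = -0.8 * src + 0.2 * rng.standard_normal(fs)   # heavily anti-phase

print("before:", phase_correlation(left, right))
# convolving with *independent* diffuse IRs pushes the inter-channel
# correlation toward zero, so anti-phase content no longer cancels
# destructively in the mono sum
left_d = fftconvolve(left, diffuse_ir(2048, fs, seed=2))[: len(left)]
right_d = fftconvolve(right, diffuse_ir(2048, fs, seed=3))[: len(right)]
print("after: ", phase_correlation(left_d, right_d))
```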
Speakers
Tommy Spurgeon
Physics Student & Undergraduate Researcher, University of South Carolina
Friday May 23, 2025 12:00pm - 1:30pm CEST
Hall F ATM Studio Warsaw, Poland

12:00pm CEST

Instantaneous Low-frequency Energetic Analysis for Detection of Standing Waves
Friday May 23, 2025 12:00pm - 1:30pm CEST
Standing waves are a phenomenon ever-present in the reproduction of low frequencies and have a direct impact on the auditory perception of this frequency region.
This study addresses the challenges posed by standing waves, which are difficult to measure accurately using conventional pressure microphones due to their spatial and temporal characteristics. To combat these issues, a state-of-the-art sound pressure-velocity probe specifically designed for measurement of intensity in the low-frequency spectrum is developed. Using this probe, the research includes the development of new energy estimation parameters to better quantify the characteristics of sound fields influenced by standing waves. Additionally, a novel "standing-wave-ness" parameter is proposed, based on two diffuseness quantities dealing with the proportion of locally confined energy and the temporal variation of the intensity vectors. The performance of the new method and probe is evaluated through both simulated and real-world measurement data. Simulations provide a controlled environment to assess the method's accuracy across a variety of scenarios, including both standing wave and non-standing wave conditions. These initial simulations are followed by validation through measurement data obtained from an anechoic chamber, ensuring that the method's capabilities are tested in highly controlled, close-to-real-world settings. Preliminary results from this dual approach show promising potential for the new method to quantify the presence of standing waves, adding a new dimension to the visualisation and understanding of low-frequency phenomena.
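The "standing-wave-ness" parameter itself is introduced in the paper, but the kind of energetic diffuseness quantity it builds on relates the time-averaged intensity vector, measured with exactly such a pressure-velocity probe, to the energy density:

```latex
% instantaneous intensity and energy density from pressure p and
% particle velocity v (rho_0: air density, c: speed of sound)
\mathbf{I} = p\,\mathbf{v},
\qquad
E = \tfrac{1}{2}\Big(\rho_0\,\lVert\mathbf{v}\rVert^{2}
    + \frac{p^{2}}{\rho_0 c^{2}}\Big)

% diffuseness: 0 for a single propagating plane wave,
% 1 for an ideally diffuse field
\psi = 1 - \frac{\lVert\langle\mathbf{I}\rangle\rVert}{c\,\langle E \rangle}
```

Note that a pure standing wave also drives the time-averaged intensity toward zero, so this quantity alone cannot separate "diffuse" from "standing"; that is presumably why the proposed parameter additionally tracks the temporal variation of the intensity vectors.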
Speakers
Madalina Nastasa
Doctoral Researcher, Aalto University
Doctoral researcher at the Acoustics Lab of Aalto University, passionate about everything audio. My research focuses on the human perception of the very low frequency spectrum, and so does my day-to-day life. When I am not in the Acoustics Lab, I organise electronic music events where...
Aki Mäkivirta
R&D Director, Genelec Oy
Aki Mäkivirta is R&D Director at Genelec, Iisalmi, Finland, and has been with Genelec since 1995. He received his Master of Science, Licentiate of Science, and Doctor of Science in Technology degrees from Tampere University of Technology in 1985, 1989, and 1992, respectively. Aki...
Friday May 23, 2025 12:00pm - 1:30pm CEST
Hall F ATM Studio Warsaw, Poland

1:45pm CEST

A Testbed for Detecting DeepFake Audio
Friday May 23, 2025 1:45pm - 3:45pm CEST
The rapid advancement of generative artificial intelligence has created highly realistic DeepFake multimedia content, posing significant challenges for digital security and authenticity verification. This paper presents the development of a comprehensive testbed designed to detect counterfeit audio content generated by DeepFake techniques. The proposed framework integrates forensic spectral analysis, numerical and statistical modeling, and machine learning-based detection to assess the authenticity of multimedia samples. Our study evaluates various detection methodologies, including spectrogram comparison, Euclidean distance-based analysis, pitch modulation assessment, and spectral flatness deviations. The results demonstrate that cloned and synthetic voices exhibit distinctive acoustic anomalies, with forensic markers such as pitch mean absolute error and power spectral density variations serving as effective indicators of manipulation. By systematically analyzing human, cloned, and synthesized voices, this research provides a foundation for advancing DeepFake detection strategies. The proposed testbed offers a scalable and adaptable solution for forensic audio verification, contributing to the broader effort of safeguarding multimedia integrity in digital environments.
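As an illustration of two of the forensic markers named above, pitch error and spectral-flatness deviation, a minimal comparison between a reference voice and a suspect clip can be written with librosa; the thresholds, alignment strategy, and full feature set of the actual testbed are the paper's own.

```python
import numpy as np
import librosa

def forensic_markers(ref_path, suspect_path, sr=16000):
    """Pitch MAE (Hz) and mean spectral-flatness difference between clips.

    Toy comparison: assumes the two clips contain the same utterance,
    roughly time-aligned; a real pipeline would align them first.
    """
    ref, _ = librosa.load(ref_path, sr=sr)
    sus, _ = librosa.load(suspect_path, sr=sr)

    # frame-wise fundamental frequency (YIN), voiced range of speech
    f0_ref = librosa.yin(ref, fmin=60, fmax=400, sr=sr)
    f0_sus = librosa.yin(sus, fmin=60, fmax=400, sr=sr)
    n = min(len(f0_ref), len(f0_sus))
    pitch_mae = float(np.mean(np.abs(f0_ref[:n] - f0_sus[:n])))

    # spectral flatness: synthetic voices often drift toward the
    # noisier (flatter) end of the scale
    sf_ref = librosa.feature.spectral_flatness(y=ref).mean()
    sf_sus = librosa.feature.spectral_flatness(y=sus).mean()
    return pitch_mae, float(sf_sus - sf_ref)
```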
Friday May 23, 2025 1:45pm - 3:45pm CEST
Hall F ATM Studio Warsaw, Poland

1:45pm CEST

An audio quality metrics toolbox for media assets management, content exchange, and dataset alignment
Friday May 23, 2025 1:45pm - 3:45pm CEST
Content exchange and collaboration serve as catalysts for repository creation that supports creative industries and fuels model development in machine learning and AI. Despite numerous repositories, challenges persist in discoverability, rights preservation, and efficient reuse of audiovisual assets. To address these issues, the SCENE (Searchable multi-dimensional Data Lakes supporting Cognitive Film Production & Distribution for the Promotion of the European Cultural Heritage) project introduces an automated audio quality assessment toolkit integrated within its Media Assets Management (MAM) platform. This toolkit comprises a suite of advanced metrics, such as artifact detection, bandwidth estimation, compression history analysis, noise profiling, speech intelligibility, environmental sound recognition, and reverberation characterization. The metrics are extracted using dedicated Flask-based web services that interface with a data lake architecture. By streamlining the inspection of large-scale audio repositories, the proposed solution benefits both high-end film productions and smaller-scale collaborations. The pilot phase of the toolkit will involve professional filmmakers who will provide feedback to refine post-production workflows. This paper presents the motivation, design, and implementation details of the toolkit, highlighting its potential to support content quality management and contribute to more efficient content exchange in the creative industries.
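The SCENE services themselves are not public, so the following is only a sketch of the general pattern the abstract describes: a small Flask web service that accepts an uploaded audio file and returns metric values as JSON. The route, port, and the toy metrics are placeholders, not the project's actual interface.

```python
import io
import numpy as np
import soundfile as sf
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/metrics", methods=["POST"])
def metrics():
    """Accept an uploaded audio file, return simple quality metrics."""
    upload = request.files["audio"]
    data, fs = sf.read(io.BytesIO(upload.read()))
    if data.ndim > 1:                        # mix multichannel input down
        data = data.mean(axis=1)
    rms = float(np.sqrt(np.mean(data**2)))
    peak = float(np.max(np.abs(data)))
    # crude bandwidth estimate: highest bin within 60 dB of the peak bin
    spec = np.abs(np.fft.rfft(data))
    freqs = np.fft.rfftfreq(len(data), 1.0 / fs)
    bw = float(freqs[spec > spec.max() * 1e-3][-1])
    return jsonify({"sample_rate": fs, "rms": rms, "peak": peak,
                    "est_bandwidth_hz": bw})

if __name__ == "__main__":
    app.run(port=5000)
```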
Speakers
Nikolaos Vryzas
Aristotle University of Thessaloniki
Dr. Nikolaos Vryzas was born in Thessaloniki in 1990. He studied Electrical & Computer Engineering at the Aristotle University of Thessaloniki (AUTh). After graduating, he received his master's degrees in Information and Communication Audio Video Technologies for Education & Production...
Iordanis Thoidis
Aristotle University of Thessaloniki
Lazaros Vrysis
Aristotle University of Thessaloniki
Friday May 23, 2025 1:45pm - 3:45pm CEST
Hall F ATM Studio Warsaw, Poland

1:45pm CEST

Application for Binaural Audio Plays: Development of Auditory Perception and Spatial Orientation
Friday May 23, 2025 1:45pm - 3:45pm CEST
When navigating the environment, we primarily rely on sight. However, in its absence, individuals must develop precise spatial awareness using other senses. A blind person can recognize their immediate surroundings through touch, but assessing larger spaces requires auditory perception.
This project presents a method for auditory training in children with visual disabilities through structured audio plays designed to teach spatial pronouns and enhance spatial orientation via auditory stimuli. The format and structure of these audio plays allow for both guided learning with a mentor and independent exploration. Binaural recordings serve as the core component of the training exercises. The developed audio plays and their analyses are available on the YouTube platform in the form of videos and interactive exercises.
The next step of this project involves developing an application that enables students to create individual accounts and track their progress. Responses collected during exercises will help assess the impact of the audio plays on students, facilitating improvements and modifications to the training materials.
Additionally, linking vision-related questions with responses to auditory exercises will, over time, provide insights into the correlation between these senses. The application can serve multiple purposes: collecting research data, offering spatial recognition and auditory perception training, and creating a comprehensive, structured environment for auditory skill development.
Friday May 23, 2025 1:45pm - 3:45pm CEST
Hall F ATM Studio Warsaw, Poland

1:45pm CEST

Exploring the Process of Interconnected Procedurally Generated Visual and Audial Content
Friday May 23, 2025 1:45pm - 3:45pm CEST
This paper investigates the innovative synthesis of procedurally generated visual and auditory content through the use of Artificial Intelligence (AI) tools, specifically focusing on Generative Pre-Trained Transformer (GPT) networks.
This research explores the process of procedurally generating audiovisual representations of semantic context by generating images, artificially providing motion, and generating corresponding multilayered sound. The process enables the generation of stop-motion audiovisual representations of concepts.
This approach not only highlights the capacity for Generative AI to produce cohesive and semantically rich audiovisual media but also delves into the interconnections between visual art, music, sonification, and computational creativity. By examining the synergy between generated imagery and corresponding soundscapes, this research paper aims to uncover new insights into the aesthetic and technical implications of the use of AI in art.
This research embodies a direct application of AI technology across multiple disciplines creating intermodal media. Research findings propose a novel framework for understanding and advancing the use of AI in the creative processes, suggesting potential pathways for future interdisciplinary research and artistic expression.
Through this work, this study contributes to the broader discourse on the role of AI in enhancing creative practices, offering perspectives on how various modes of semantic representation can be interleaved using state-of-the-art technology.
Friday May 23, 2025 1:45pm - 3:45pm CEST
Hall F ATM Studio Warsaw, Poland

1:45pm CEST

G.A.D.A.: Guitar Audio Dataset for AI - An Open-Source Multi-Class Guitar Corpus
Friday May 23, 2025 1:45pm - 3:45pm CEST
We present G.A.D.A. (Guitar Audio Dataset for AI), a novel open-source dataset designed for advancing research in guitar audio analysis, signal processing, and machine learning applications. This comprehensive corpus comprises recordings from three main guitar categories: electric, acoustic, and bass guitars, featuring multiple instruments within each category to ensure dataset diversity and robustness.

The recording methodology employs two distinct approaches based on instrument type. Electric and bass guitars were recorded using direct recording techniques via DI boxes, providing clean, unprocessed signals ideal for further digital processing and manipulation. For acoustic guitars, where direct recording was not feasible, we utilized multiple microphone configurations at various positions to capture the complete acoustic properties of the instruments. Both recording approaches prioritize signal quality while maintaining maximum flexibility for subsequent processing and analysis.

The dataset includes standardized recordings of major and minor chords played in multiple positions and voicings across all instruments. Each recording is accompanied by detailed metadata, including instrument specifications, recording equipment details, microphone configurations (for acoustic guitars), and chord information. The clean signals from electric instruments enable various post-processing applications, including virtual amplifier modeling, effects processing, impulse response convolution, and room acoustics simulation.

To evaluate G.A.D.A.'s effectiveness in machine learning applications, we propose a comprehensive testing framework using established algorithms including k-Nearest Neighbors, Support Vector Machines, Convolutional Neural Networks, and Feed-Forward Neural Networks. These experiments will focus on instrument classification tasks using both traditional audio features and deep learning approaches.
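To make the evaluation framework concrete, a minimal version of the simplest proposed baseline, k-Nearest Neighbors over time-averaged MFCC features, could look like the sketch below; the file names are hypothetical stand-ins for the corpus, and a real run needs many recordings per class.

```python
import numpy as np
import librosa
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

def mfcc_features(path, sr=22050, n_mfcc=20):
    """Average MFCCs over time: one fixed-length vector per recording."""
    y, _ = librosa.load(path, sr=sr)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    return mfcc.mean(axis=1)

# hypothetical (path, label) pairs drawn from the corpus
files = [("electric_01.wav", "electric"),
         ("acoustic_01.wav", "acoustic"),
         ("bass_01.wav", "bass")]          # ... and many more

X = np.stack([mfcc_features(p) for p, _ in files])
y = np.array([label for _, label in files])

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2)
clf = KNeighborsClassifier(n_neighbors=5).fit(X_tr, y_tr)
print("accuracy:", clf.score(X_te, y_te))
```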

G.A.D.A. will be freely available for academic and research purposes, complete with documentation, preprocessing scripts, example code, and usage guidelines. This resource aims to facilitate research in musical instrument classification, audio signal processing, deep learning applications in music technology, computer-aided music education, and automated music transcription systems.

The combination of standardized recording methodologies, comprehensive metadata, and the inclusion of both direct-recorded and multi-microphone captured audio makes G.A.D.A. a valuable resource for comparative studies and reproducible research in music information retrieval and audio processing.
Friday May 23, 2025 1:45pm - 3:45pm CEST
Hall F ATM Studio Warsaw, Poland

4:00pm CEST

Binamix - A Python Library for Generating Binaural Audio Datasets
Friday May 23, 2025 4:00pm - 6:00pm CEST
The increasing demand for spatial audio in applications such as virtual reality, immersive media, and spatial audio research necessitates robust solutions for binaural audio dataset generation for testing and validation. Binamix is an open-source Python library designed to facilitate programmatic binaural mixing using the extensive SADIE II Database, which provides HRIR and BRIR data for 20 subjects. The Binamix library provides a flexible and repeatable framework for creating large-scale spatial audio datasets, making it an invaluable resource for codec evaluation, audio quality metric development, and machine learning model training. A range of pre-built example scripts, utility functions, and visualization plots further streamline the process of custom pipeline creation. This paper presents an overview of the library's capabilities, including binaural rendering, impulse response interpolation, and multi-track mixing for various speaker layouts. The tools utilize a modified Delaunay triangulation technique to achieve accurate HRIR/BRIR interpolation where desired angles are not present in the data. By supporting a wide range of parameters such as azimuth, elevation, subject IRs, speaker layouts, mixing controls, and more, the library enables researchers to create large binaural datasets for any downstream purpose. Binamix empowers researchers and developers to advance spatial audio applications with reproducible methodologies by offering an open-source solution for binaural rendering and dataset generation.
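Binamix's own API is best taken from its documentation; what can be sketched independently is the modified-Delaunay idea the paper names: triangulate the measured directions, then blend the HRIRs of the enclosing triangle with barycentric weights. The helper below is an illustrative planar simplification, not Binamix code.

```python
import numpy as np
from scipy.spatial import Delaunay

def interpolate_hrir(az, el, grid, hrirs):
    """Barycentric HRIR interpolation on an (azimuth, elevation) grid.

    grid  : (M, 2) measured directions in degrees
    hrirs : (M, taps, 2) impulse responses (left/right) per direction
    """
    tri = Delaunay(grid)                    # triangulate measured points
    q = np.array([az, el], dtype=float)
    s = int(tri.find_simplex(q))
    if s == -1:
        raise ValueError("query direction outside the measured grid")
    T = tri.transform[s]                    # affine map to barycentric
    b = T[:2] @ (q - T[2])
    w = np.append(b, 1.0 - b.sum())         # weights of the 3 vertices
    verts = tri.simplices[s]
    return np.tensordot(w, hrirs[verts], axes=1)   # weighted HRIR blend

# toy usage with random data standing in for SADIE II measurements
rng = np.random.default_rng(0)
grid = rng.uniform([-180, -40], [180, 80], size=(60, 2))
hrirs = rng.standard_normal((60, 256, 2))
h = interpolate_hrir(10.0, 15.0, grid, hrirs)      # shape (256, 2)
```

A real spherical implementation (and presumably the library's "modified" triangulation) also has to handle azimuth wrap-around and the poles, which this planar sketch ignores.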
Speakers
Jan Skoglund
Google
Jan Skoglund leads a team at Google in San Francisco, CA, developing speech and audio signal processing components for capture, real-time communication, storage, and rendering. These components have been deployed in Google software products such as Meet and hardware products such...
Friday May 23, 2025 4:00pm - 6:00pm CEST
Hall F ATM Studio Warsaw, Poland

4:00pm CEST

Neural 3D Audio Renderer for acoustic digital twin creation
Friday May 23, 2025 4:00pm - 6:00pm CEST
In this work, we introduce a Neural 3D Audio Renderer (N3DAR) - a conceptual solution for creating acoustic digital twins of arbitrary spaces. We propose a workflow that consists of several stages including:
1. Simulation of high-fidelity Spatial Room Impulse Responses (SRIR) based on the 3D model of a digitalized space,
2. Building an ML-based model of this space for interpolation and reconstruction of SRIRs,
3. Development of a real-time 3D audio renderer that allows the deployment of the digital twin of a space with accurate spatial audio effects consistent with the actual acoustic properties of this space.
The first stage consists of preparing the 3D model and running the SRIR simulations using a state-of-the-art wave-based method for arbitrary pairs of source-receiver positions. This stage provides the training data used in the second stage - training the SRIR reconstruction model. The training stage aims to learn a model of the acoustic properties of the digitalized space using the Acoustic Volume Rendering (AVR) approach. The last stage is the construction of a plugin with a dedicated 3D audio renderer, where rendering comprises reconstruction of the early part of the SRIR, estimation of the reverberant part, and HOA-based binauralization.
N3DAR allows the building of tailored audio rendering plugins that can be deployed along with visual 3D models of digitalized spaces, where users can freely navigate through the space with 6 degrees of freedom and experience high-fidelity binaural playback in real time.
We provide a detailed description of the challenges and considerations for each of the stages. We also conduct an extensive evaluation of the audio rendering capabilities with both objective metrics and subjective methods, using a dedicated evaluation platform.
Friday May 23, 2025 4:00pm - 6:00pm CEST
Hall F ATM Studio Warsaw, Poland

4:00pm CEST

Performance Estimation Method for 3D Microphone Array based on the Modified Steering Vector in Spherical Harmonic Domain
Friday May 23, 2025 4:00pm - 6:00pm CEST
This paper presents an objective method for estimating the performance of 3D microphone arrays, which is also applicable to 2D arrays. The method incorporates the physical characteristics and relative positions of the microphones, merging these elements through a weighted summation to derive the arrays' directional patterns. These patterns are represented as a "Modified Steering Vector." Additionally, leveraging the spatial properties of spherical harmonics, we transform the array's directional pattern into the spherical harmonic domain. This transformation enables a quantitative analysis of the physical properties of each component, providing a comprehensive understanding of the array's performance. Overall, the proposed method offers a deeply insightful and versatile framework for evaluating the performance of both 2D and 3D microphone arrays by fully exploiting their inherent physical characteristics.
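For context, the conventional far-field steering vector that a "Modified Steering Vector" of this kind generalizes is, for microphone positions r_m, wavenumber k = 2πf/c, and unit look direction u:

```latex
% conventional steering vector of an M-microphone array
\mathbf{a}(\mathbf{u}, f) =
\left[ e^{\,j k\,\mathbf{r}_1 \cdot \mathbf{u}},\;
       e^{\,j k\,\mathbf{r}_2 \cdot \mathbf{u}},\;
       \dots,\;
       e^{\,j k\,\mathbf{r}_M \cdot \mathbf{u}} \right]^{\mathsf T}

% one plausible reading of the "modified" version: each element is
% further weighted by that microphone's own directional response
\tilde{a}_m(\mathbf{u}, f) = D_m(\mathbf{u})\,
    e^{\,j k\,\mathbf{r}_m \cdot \mathbf{u}}
```

The exact construction, and the subsequent transform of the resulting directional pattern into the spherical harmonic domain, is the paper's own.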
Friday May 23, 2025 4:00pm - 6:00pm CEST
Hall F ATM Studio Warsaw, Poland

4:00pm CEST

Reconstructing Sound Fields with Physics-Informed Neural Networks: Applications in Real-World Acoustic Environments
Friday May 23, 2025 4:00pm - 6:00pm CEST
The reconstruction of sound fields is a critical component in a range of applications, including spatial audio for augmented, virtual, and mixed reality (AR/VR/XR) environments, as well as for optimizing acoustics in physical spaces. Traditional approaches to sound field reconstruction predominantly rely on interpolation techniques, which estimate sound fields based on a limited number of spatial and temporal measurements. However, these methods often struggle with issues of accuracy and realism, particularly in complex and dynamic environments. Recent advancements in deep learning have provided promising alternatives, particularly with the introduction of Physics-Informed Neural Networks (PINNs), which integrate physical laws directly into the model training process. This study aims to explore the application of PINNs for sound field reconstruction, focusing on the challenge of predicting acoustic fields in unmeasured areas. The experimental setup involved the collection of impulse response data from the Promenadikeskus concert hall in Pori, Finland, using various source and receiver positions. The PINN framework is then utilized to simulate the hall’s acoustic behavior, with parameters incorporated to model sound propagation across different frequencies and source-receiver configurations. Despite challenges arising from computational load, pre-processing strategies were implemented to optimize the model's efficiency. The results demonstrate that PINNs can accurately reconstruct sound fields in complex acoustic environments, offering significant potential for real-time sound field control and immersive audio applications.
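The study's model and training details are its own; the core PINN ingredient, penalizing the residual of the governing equation at collocation points, can nevertheless be sketched. Below, a small PyTorch network predicts complex pressure at a single frequency and is trained on a Helmholtz residual (∇²p + k²p = 0) plus a data-fit term; the network size, sampling, and toy data are illustrative.

```python
import math
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(3, 128), nn.Tanh(),
                    nn.Linear(128, 128), nn.Tanh(),
                    nn.Linear(128, 2))        # [Re(p), Im(p)] at (x, y, z)

def helmholtz_residual(xyz, k):
    """Residual of nabla^2 p + k^2 p = 0 at the points xyz (N, 3)."""
    xyz = xyz.detach().requires_grad_(True)
    p = net(xyz)
    res = []
    for i in range(2):                        # real and imaginary parts
        g = torch.autograd.grad(p[:, i].sum(), xyz, create_graph=True)[0]
        lap = sum(torch.autograd.grad(g[:, d].sum(), xyz,
                                      create_graph=True)[0][:, d]
                  for d in range(3))          # trace of the Hessian
        res.append(lap + k**2 * p[:, i])
    return torch.stack(res, dim=1)

k = 2 * math.pi * 500.0 / 343.0               # wavenumber at 500 Hz
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
xyz_meas = torch.rand(64, 3) * 10             # measured positions (toy)
p_meas = torch.randn(64, 2)                   # measured pressures (toy)
for step in range(200):
    opt.zero_grad()
    colloc = torch.rand(256, 3) * 10          # random collocation points
    loss = (helmholtz_residual(colloc, k)**2).mean() \
         + ((net(xyz_meas) - p_meas)**2).mean()
    loss.backward()
    opt.step()
```

The physics term lets the network generalize to unmeasured positions, which is exactly the "predicting acoustic fields in unmeasured areas" challenge the abstract describes.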
Speakers
Rigas Kotsakis
Aristotle University of Thessaloniki
Iordanis Thoidis
Aristotle University of Thessaloniki
Nikolaos Vryzas
Aristotle University of Thessaloniki
Lazaros Vrysis
Aristotle University of Thessaloniki
Friday May 23, 2025 4:00pm - 6:00pm CEST
Hall F ATM Studio Warsaw, Poland

4:00pm CEST

Recording and post-production of Dietrich Buxtehude's baroque cantatas in stereo and Dolby Atmos using an experimental 3D microphone array
Friday May 23, 2025 4:00pm - 6:00pm CEST
3D recordings seem to be an attractive solution for achieving an immersive effect, and Dolby Atmos has recently become an increasingly popular format for distributing three-dimensional music recordings, although stereo is still the main format in which music recordings are produced.

How can traditional microphone techniques be optimally extended when recording classical music, so that both stereo recordings and three-dimensional formats (e.g. Dolby Atmos) can be obtained in post-production? The author tries to answer this question using the example of a recording of Dietrich Buxtehude's work "Membra Jesu Nostri", BuxWV 75. The cycle of seven cantatas, composed in 1680, is one of the most important and most popular compositions of the early Baroque era. The first Polish recording was made by Arte Dei Suonatori conducted by Bartłomiej Stankowiak, with soloists and choral parts performed by the choir Cantus Humanus.

The author will present his concept of a microphone set for 3D recordings. In addition to the detailed microphone setup, the presentation will cover the post-production method, combining the stereo mix with a Dolby Atmos mix in a 7.2.4 speaker configuration. A workflow will be proposed to facilitate switching between the different formats.
Friday May 23, 2025 4:00pm - 6:00pm CEST
Hall F ATM Studio Warsaw, Poland

4:00pm CEST

Subjective Evaluation on Three-dimensional VBAP and Ambisonics in an Immersive Concert Setting
Friday May 23, 2025 4:00pm - 6:00pm CEST
This paper investigates the subjective evaluation of two prominent three-dimensional spatialization techniques—Vector Base Amplitude Panning (VBAP) and High-Order Ambisonics (HOA)—using IRCAM’s Spat in an immersive concert setting. The listening test was conducted in the New Hall at the Royal Danish Academy of Music, which features a 44-speaker immersive audio system. The musical stimuli included electronic compositions and modern orchestral recordings, providing a diverse range of temporal and spectral content. The participants comprised experienced Tonmeisters and non-experienced musicians, who were seated in off-center positions to simulate real-world audience conditions. This study provides an ecologically valid subjective evaluation methodology.
The results indicated that VBAP excelled in spatial clarity and sound quality, while HOA demonstrated superior envelopment. The perceptual differences between the two techniques were relatively minor, influenced by room acoustics and suboptimal listening positions. Furthermore, music genre had no significant impact on the evaluation outcomes.
The study highlights VBAP’s strength in precise localization and HOA's capability for creating immersive soundscapes, aiming to bridge the gap between ideal and real-world applications in immersive sound reproduction and perception. The findings suggest the need to balance trade-offs when selecting spatialization techniques for specific purposes, venues, and audience positions. Future research will focus on evaluating a wider range of spatialization methods in concert environments and optimizing them to improve the auditory experience for distributed audiences.
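For readers new to the two techniques: in three-dimensional VBAP, a source direction p is reproduced by the loudspeaker triplet l1, l2, l3 that encloses it, with gains

```latex
% rows of L_123 are the unit direction vectors of the three active
% loudspeakers; the gains g are subsequently normalized, e.g. ||g|| = 1
\mathbf{p}^{\mathsf T} = \mathbf{g}\,\mathbf{L}_{123},
\qquad
\mathbf{g} = \mathbf{p}^{\mathsf T}\,\mathbf{L}_{123}^{-1},
\qquad
\mathbf{L}_{123} =
\begin{bmatrix}
\mathbf{l}_1^{\mathsf T}\\ \mathbf{l}_2^{\mathsf T}\\ \mathbf{l}_3^{\mathsf T}
\end{bmatrix}
```

HOA instead encodes sources into spherical-harmonic coefficients that are decoded to all loudspeakers at once, which is consistent with the envelopment advantage reported here, while VBAP's sparse triplet activation fits its edge in localization.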
Speakers
avatar for Jesper Andersen

Jesper Andersen

Head of Tonmeister Programme, Det Kgl Danske Musikkonservatorium
As a Grammy-nominated producer, engineer and pianist Jesper has recorded around 100 CDs and produced music for radio, TV, theatre, installations and performance. Jesper has also worked as a sound engineer/producer at the Danish Broadcasting Corporation.A recent album-production is... Read More →
avatar for Stefania Serafin

Stefania Serafin

Professor, Aalborg University Copenhagen
I am Professor in Sonic interaction design at Aalborg University in Copenhagen and leader of the Multisensory Experience Labtogether with Rolf Nordahl.I am the President of the Sound and Music Computing association, Project Leader of the Nordic Sound and Music Computing netwo... Read More →
Friday May 23, 2025 4:00pm - 6:00pm CEST
Hall F ATM Studio Warsaw, Poland

4:00pm CEST

Visualization of the spatial behavior between channels in surround program
Friday May 23, 2025 4:00pm - 6:00pm CEST
Friday May 23, 2025 4:00pm - 6:00pm CEST
Hall F ATM Studio Warsaw, Poland

4:30pm CEST

Ask Us Anything About Starting Your Career
Friday May 23, 2025 4:30pm - 6:00pm CEST
Join a panel of professionals from a variety of fields in the industry as we discuss topics including how to enter the audio industry, how the panelists got started and the paths their careers took, with advice geared towards students and recent graduates. Bring your questions for the panelists – most of this workshop will be focused on the information YOU want to hear!
Speakers
Ian Corbett
Coordinator & Professor, Audio Engineering & Music Technology, Kansas City Kansas Community College
Dr. Ian Corbett is the Coordinator and Professor of Audio Engineering and Music Technology at Kansas City Kansas Community College. He also owns and operates off-beat-open-hats LLC, providing live sound, recording, and audio production services to clients in the Kansas City area...
Friday May 23, 2025 4:30pm - 6:00pm CEST
Hall F ATM Studio Warsaw, Poland
  Audio in education
 

