Venue: Hall F
Thursday, May 22
 

9:00am CEST

Free Online Course “Spatial Audio - Practical Master Guide”
Thursday May 22, 2025 9:00am - 10:00am CEST
“Spatial Audio - Practical Master Guide” is a free online course on spatial audio content creation. The target group is people with basic knowledge of audio production who are not necessarily dedicated experts in the underlying technologies and aesthetics. “Spatial Audio - Practical Master Guide” will be released on the Acoucou platform chapter by chapter throughout spring 2025. Some course content is already available as a preview.

The course comprises a variety of audio examples and interactive content that allow learners to develop their skills in a playful manner. It covers the entire spectrum, from psychoacoustics through the underlying technologies to delivery formats. The course’s highlights are the 14 case studies and step-by-step guides that provide behind-the-scenes information. Many of the course components are self-contained, so they can be used in isolation or integrated into other educational contexts.

The workshop on “Spatial Audio - Practical Master Guide” will provide an overview of the course contents, and we will explain the educational concepts that the course is based on. We will demonstrate the look and feel of the course on the Acoucou platform through a set of representative examples from the courseware and give the audience the opportunity to experience it themselves. The workshop will wrap up with a discussion of the contexts in which the course contents may be useful beyond self-study.

Course contents:
Chapter 1: Overview (introduction, history of spatial audio, evolution of aesthetics in spatial audio)
Chapter 2: Psychoacoustics (spatial hearing, perception of reverberation)
Chapter 3: Reproduction (loudspeaker arrays, headphones)
Chapter 4: Capture (microphone arrays)
Chapter 5: Ambisonics (capture, reproduction, editing of ambisonic content)
Chapter 6: Storing spatial audio content
Chapter 7: Delivery formats

Case studies: Dolby Atmos truck streaming, fulldome, icosahedral loudspeaker, spatial audio sound installation, spatial audio at Friedrichstadt Palast, spatial audio in the health industry, live music performance with spatial audio, spatial audio in automotive

Step-by-step guides: setting up your spatial audio workstation, channel-based production for music, Dolby Atmos mix for cinema, Ambisonics sound production for 360 film, build your own Ambisonic microphone array, interactive spatial audio

Links:
https://spatial-audio.acoucou.org/
https://acoucou.org/
Thursday May 22, 2025 9:00am - 10:00am CEST
Hall F ATM Studio Warsaw, Poland

10:00am CEST

Audio equipment: musical instrument design and application
Thursday May 22, 2025 10:00am - 12:00pm CEST
The evolution of musical instruments has been deeply influenced by advancements in audio equipment, allowing for the creation of instruments that bridge the gap between tradition and modern innovation. This paper highlights the integration of modern technologies such as digital signal processing (DSP), artificial intelligence (AI), and advanced materials into musical instruments to enhance functionality, sound quality, and musicians’ experience at all levels, by examining historical progress, design principles, and modern innovations.

Major areas of focus include the roles of electronic components such as pickups, sensors, and wireless interfaces in improving the functionality of modern musical instruments, as well as the impact of high-performance materials on durability and sustainability. Case studies of the digital piano and the talking drum provide practical insights into how these innovations are being implemented, and into the contrasts between the two. The paper further addresses challenges such as maintaining the cultural authenticity of traditional instruments while integrating modern technology, issues of latency, accessibility for diverse users globally, and sustainability concerns in manufacturing.
Thursday May 22, 2025 10:00am - 12:00pm CEST
Hall F ATM Studio Warsaw, Poland

10:00am CEST

Hearing History: The Role of Acoustic Simulation in the Digital Reconstruction of the Wołpa Synagogue
Thursday May 22, 2025 10:00am - 12:00pm CEST
This paper presents a case study on the auralization of the lost wooden synagogue in Wołpa, digitally reconstructed using a Heritage Building Information Modelling (HBIM) framework for virtual reality (VR) presentation. The study explores how acoustic simulation can aid in the preservation of intangible heritage, focusing on the synagogue’s unique acoustics. Using historical documentation, the synagogue was reconstructed with accurate geometric and material properties, and its acoustics were analyzed through high-fidelity ray-tracing simulations.
A key objective of this project is to recreate the Shema Israel ritual, incorporating a historical recording of the rabbi’s prayers. To enable interactive exploration, real-time auralization techniques were optimized to balance computational efficiency and perceptual authenticity, aiming to overcome the trade-offs between simplified VR audio models and physically accurate simulations. This research underscores the transformative potential of immersive technologies in reviving lost heritage, offering a scalable, multi-sensory approach to preserving sacred soundscapes and ritual experiences.
Thursday May 22, 2025 10:00am - 12:00pm CEST
Hall F ATM Studio Warsaw, Poland

10:00am CEST

Real-Time Performer Switching in Chamber Music
Thursday May 22, 2025 10:00am - 12:00pm CEST
The article explores the innovative concept of interactive music, where both creators and listeners can actively shape the structure and sound of a musical piece in real-time. Traditionally, music is passively consumed, but interactivity introduces a new dimension, allowing for creative participation and raising questions about authorship and the listener's role. The project "Sound Permutation: A Real-Time Interactive Musical Experiment" aims to create a unique audio-visual experience by enabling listeners to choose performers for a chamber music piece in semi-real-time. Two well-known compositions, Edward Elgar's "Salut d’Amour" and Camille Saint-Saëns' "Le Cygne," were recorded by three cellists and three pianists in all possible combinations. This setup allows listeners to seamlessly switch between performers' parts, offering a novel musical experience that highlights the impact of individual musicians on the perception of the piece.

The project focuses on chamber music, particularly the piano-cello duet, and utilizes advanced recording technology to ensure high-quality audio and video. The interactive system, developed using JavaScript, allows for smooth video streaming and performer switching. The user interface is designed to be intuitive, featuring options for selecting performers and camera views. The system's optimization ensures minimal disruption during transitions, providing a cohesive musical experience. This project represents a significant step towards making interactive music more accessible, showcasing the potential of technology in shaping new forms of artistic engagement and participation.
Speakers
Pawel Malecki

Professor, AGH University of Krakow
Thursday May 22, 2025 10:00am - 12:00pm CEST
Hall F ATM Studio Warsaw, Poland

10:00am CEST

The benefits, tradeoffs, and economics of standard and proprietary digital audio networks in DSP systems
Thursday May 22, 2025 10:00am - 12:00pm CEST
In the field of digital audio signal processing (DSP) systems, the choice between standard and proprietary digital audio networks (DANs) can significantly impact both functionality and performance. This abstract aims to explore the benefits, tradeoffs, and economic implications of these two approaches, providing a comprehensive comparison to aid decision-making for audio professionals and system designers. The abstract emphasizes the key benefits of A2B, AoIP, and the older proprietary networks currently adopted.

Conclusion
The choice between standard and proprietary digital audio networks in audio DSP systems involves a careful consideration of benefits, tradeoffs, and economic implications. Standards-based systems provide interoperability and cost-effectiveness, while proprietary solutions offer optimized performance and innovative features. Understanding these factors can guide audio professionals and system designers in making informed decisions that align with their specific needs and long-term goals.
Speakers
Miguel Chavez

Strategic Marketing ProAudio, Analog Devices
Electrical and Mechanical Engineering bachelor's degree from Universidad Panamericana in Mexico City. Master of Science in Music Engineering from the University of Miami. EMBA from Boston University. Worked at Analog Devices developing DSP software and algorithms (SigmaStudio) for 17 years...
Thursday May 22, 2025 10:00am - 12:00pm CEST
Hall F ATM Studio Warsaw, Poland

10:00am CEST

The Sound Map of Białystok − From monophonic to immersive audio repository of urban soundscapes
Thursday May 22, 2025 10:00am - 12:00pm CEST
This paper presents an ongoing project that aims to document the urban soundscapes of the Polish city of Białystok. It describes the progress made so far, including the selection of sonic landmarks, the process of acquiring the audio recordings, and the design of the unique graphic user interface featuring original drawings. Furthermore, it elaborates on the ongoing efforts to extend the project beyond the scope of a typical urban soundscape repository. In the present phase of the project, in addition to monophonic recordings, audio excerpts are acquired in binaural and Ambisonic sound formats, providing listeners with an immersive experience. Moreover, state-of-the-art machine-learning algorithms are applied to analyze gathered audio recordings in terms of their content and spatial characteristics, ultimately providing prospective users of the sound map with some form of automatic audio tagging functionality.
Thursday May 22, 2025 10:00am - 12:00pm CEST
Hall F ATM Studio Warsaw, Poland

11:45am CEST

Best practices for wireless audio in live production
Thursday May 22, 2025 11:45am - 12:45pm CEST
Wireless audio, both mics and in-ear-monitors, has become essential in many live productions of music and theatre, but it is often fraught with uneasiness and uncertainty. The panel of presenters will draw on their varied experience and knowledge to show how practitioners can use best engineering practices to ensure reliability and performance of their wireless mic and in-ear-monitor systems.
Speakers
Bob Lee

Applications Engineer / Trainer, RF Venue, Inc.
I'm a fellow of the AES, an RF and electronics geek, and a live audio specialist, especially in both amateur and professional theater. My résumé includes Sennheiser, ARRL, and a 27-year-long tenure at QSC. Now I help live audio practitioners up their wireless mic and IEM game. I play...
Thursday May 22, 2025 11:45am - 12:45pm CEST
Hall F ATM Studio Warsaw, Poland

1:00pm CEST

Convention Opening Ceremony and AES Awards Presentation
Thursday May 22, 2025 1:00pm - 2:30pm CEST
Thursday May 22, 2025 1:00pm - 2:30pm CEST
Hall F ATM Studio Warsaw, Poland

2:45pm CEST

Students Welcome
Thursday May 22, 2025 2:45pm - 3:30pm CEST
Thursday May 22, 2025 2:45pm - 3:30pm CEST
Hall F ATM Studio Warsaw, Poland

3:00pm CEST

An in-situ perceptual evaluation of spatial audio in an automotive environment
Thursday May 22, 2025 3:00pm - 5:00pm CEST
Speakers
Bogdan Bacila

Postdoc, Institute of Sound and Vibration Research - University of Southampton
Filippo Fazi

University of Southampton
Thursday May 22, 2025 3:00pm - 5:00pm CEST
Hall F ATM Studio Warsaw, Poland

3:00pm CEST

Automated soundstage tuning in cars
Thursday May 22, 2025 3:00pm - 5:00pm CEST
Thursday May 22, 2025 3:00pm - 5:00pm CEST
Hall F ATM Studio Warsaw, Poland

3:00pm CEST

Comparing Artificially Created Acoustic Environments to Real Space Responses: Integrating Objective Metrics and Subjective Perceptual Listening Tests
Thursday May 22, 2025 3:00pm - 5:00pm CEST
This study evaluates the effectiveness of artificial reverberation algorithms that are used to create simulated acoustic environments by comparing them to the acoustic responses of real spaces. A mixed-methods approach, integrating objective and subjective measures, was employed to assess both the accuracy and the perceptual quality of simulated acoustics. Real-world spaces, within a research project…, were selected for their varying sizes, functions, and acoustical properties. Objective acoustic measurements—such as the Room Impulse Response (RIR) and features extracted from it, i.e., Reverberation Time (RT60), Early Decay Time (EDT), Clarity index (C50, C80), and Definition (D50)—were conducted to establish baseline profiles. Simulated environments were created to replicate real-world conditions, incorporating source-receiver configurations, room geometries, and/or material properties. Objective metrics were extracted from these simulations for comparison with real-world data. After applying the artificial reverberation algorithm, the same objective measurements were re-recorded to assess its impact. Subjective listening tests were also conducted, with a diverse panel of listeners rating the perceived clarity, intelligibility, comfort, and overall sound quality of both real and simulated spaces, using a double-blind procedure to mitigate bias. Statistical analyses, including paired t-tests and correlation analysis, were performed to assess the relationship between objective and subjective evaluations. This approach provides a comprehensive framework for evaluating the algorithm’s ability to enhance simulated acoustics and align them with real-world environments.
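To make the baseline metrics concrete, here is a minimal sketch of how two of the listed parameters can be extracted from a measured RIR with NumPy; the function names and the T30-based RT60 fit are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def schroeder_edc(rir):
    """Schroeder backward-integrated energy decay curve, in dB."""
    energy = np.cumsum(rir[::-1] ** 2)[::-1]
    return 10 * np.log10(energy / energy.max())

def rt60_from_t30(rir, fs):
    """RT60 extrapolated from the -5..-35 dB slope of the decay curve (T30)."""
    edc = schroeder_edc(rir)
    t = np.arange(len(edc)) / fs
    mask = (edc <= -5) & (edc >= -35)
    slope, _ = np.polyfit(t[mask], edc[mask], 1)  # dB per second (negative)
    return -60.0 / slope

def clarity(rir, fs, early_ms=50):
    """Clarity index (C50 for 50 ms, C80 for 80 ms): early/late energy in dB."""
    n = int(fs * early_ms / 1000)
    return 10 * np.log10(np.sum(rir[:n] ** 2) / np.sum(rir[n:] ** 2))
```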
Speakers
Rigas Kotsakis

Aristotle University of Thessaloniki
Nikolaos Vryzas

Aristotle University Thessaloniki
Dr. Nikolaos Vryzas was born in Thessaloniki in 1990. He studied Electrical & Computer Engineering at the Aristotle University of Thessaloniki (AUTh). After graduating, he received his master's degrees in Information and Communication Audio Video Technologies for Education & Production...
Thursday May 22, 2025 3:00pm - 5:00pm CEST
Hall F ATM Studio Warsaw, Poland

3:00pm CEST

Dynamic Diffuse Signal Processing (DiSP) as a Method of Decorrelating Early Reflections In Automobile Audio Systems
Thursday May 22, 2025 3:00pm - 5:00pm CEST
Automotive audio systems operate in highly reflective and acoustically challenging environments that differ significantly from optimized listening spaces such as concert halls or home theaters. The compact and enclosed nature of car cabins, combined with the presence of reflective surfaces—including the dashboard, windshield, and windows—creates strong early reflections that interfere with the direct sound from loudspeakers. These reflections result in coherent interference, comb filtering, and position-dependent variations in frequency response, leading to inconsistent tonal balance, reduced speech intelligibility, and compromised stereo imaging and spatial localization. Traditional approaches, such as equalization and time alignment, attempt to compensate for these acoustic artifacts but do not effectively address coherence issues arising from coherent early reflections.
To mitigate these challenges, this study explores Dynamic Diffuse Signal Processing (DiSP) as an alternative solution for reducing early reflection coherence within automotive environments. DiSP is a convolution-based signal processing technique that, when implemented effectively, decorrelates coherent signals while leaving them perceptually identical. While this method has been successfully studied in sound reinforcement and multi-speaker environments, its application in automotive audio has not been extensively studied.
This research investigates the effectiveness of DiSP by analyzing pre- and post-DiSP impulse responses and frequency response variations at multiple listening positions, assessing its ability to mitigate phase interference and reduce comb filtering. Experimental results indicate that DiSP significantly improves the uniformity of sound distribution, reducing spectral deviations across seating positions and minimizing unwanted artifacts caused by early reflections. These findings suggest that DiSP can serve as a powerful tool for optimizing in-car audio reproduction, offering a scalable and computationally efficient approach to improving listener experience in modern automotive sound systems.
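For readers unfamiliar with the technique, the sketch below illustrates the core idea behind temporally diffuse impulses: a filter with near-flat magnitude and randomized, diffuse phase decorrelates a signal while leaving it perceptually similar. It is a simplified illustration under stated assumptions, not the authors' TDI synthesis:

```python
import numpy as np

def diffuse_impulse(n_taps, fs, decay_ms=20.0, seed=0):
    """A temporally diffuse, approximately allpass FIR: random phase over a
    flat magnitude spectrum, with an exponential envelope imposed in time
    (envelope length and tap count are illustrative choices)."""
    rng = np.random.default_rng(seed)
    phase = rng.uniform(-np.pi, np.pi, n_taps // 2 + 1)
    phase[0] = phase[-1] = 0.0                 # keep DC and Nyquist real
    h = np.fft.irfft(np.exp(1j * phase), n_taps)
    h *= np.exp(-np.arange(n_taps) / (fs * decay_ms / 1000.0))
    return h / np.sqrt(np.sum(h ** 2))         # normalize to unit energy

# Convolving coherent signals with independent diffuse impulses breaks
# their mutual coherence, e.g.:
# y1 = np.convolve(x, diffuse_impulse(2048, 48000, seed=1))
# y2 = np.convolve(x, diffuse_impulse(2048, 48000, seed=2))
```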
Speakers
Tommy Spurgeon

Physics Student & Undergraduate Researcher, University of South Carolina
Thursday May 22, 2025 3:00pm - 5:00pm CEST
Hall F ATM Studio Warsaw, Poland

3:00pm CEST

Perceptual Evaluation of Varying Levels of Acoustic Detail in Multimodal Virtual Reality
Thursday May 22, 2025 3:00pm - 5:00pm CEST
Speakers
Haowen Zhao

University of York
I am now working as an audio engineer with my research into 6 Degrees-of-Freedom (6DoF) audio for Virtual Reality (VR); this includes hybrid acoustic modelling methods for real-time calculation. I am currently looking at perceptual differences in different acoustic rendering methods...
Damian Murphy

University of York
Thursday May 22, 2025 3:00pm - 5:00pm CEST
Hall F ATM Studio Warsaw, Poland

3:45pm CEST

Caudio Sponsored Session
Thursday May 22, 2025 3:45pm - 4:45pm CEST
Thursday May 22, 2025 3:45pm - 4:45pm CEST
Hall F ATM Studio Warsaw, Poland

6:00pm CEST

Heyser Lecture
Thursday May 22, 2025 6:00pm - 7:00pm CEST
Embarking on my professional journey as a young DSP engineer at Fraunhofer IIS in Erlangen, Germany, in 1989, I quickly encountered a profound insight that would shape my entire career in audio: audio is not merely data like any other set of numbers; its significance lies in how it sounds to us as human listeners. The sonic quality of audio signals cannot be captured by simple metrics like ‘signal-to-noise ratio.’ Instead, the true goal of any skilled audio engineer should be to enhance quality in ways that are genuinely perceptible through listening, rather than relying solely on mathematical diagnostics.

This foundational concept has been a catalyst for innovation throughout my career, from pioneering popular perceptual audio codecs like MP3 and AAC to exploring audio for VR/AR and AI-driven audio coding.

Join me in this lecture as I share my personal 36-year research journey that led me to believe that in the world of media, it’s all about perception!
Thursday May 22, 2025 6:00pm - 7:00pm CEST
Hall F ATM Studio Warsaw, Poland
 
Friday, May 23
 

9:00am CEST

How to create and use audio for accessible video games?
Friday May 23, 2025 9:00am - 10:00am CEST
Sound is one of the most powerful tools for accessibility in video games, enabling players with visual impairments or cognitive disabilities to navigate, interact, and fully engage with the game world. This panel will explore how sound engineers can leverage audio design to enhance accessibility, making games more inclusive without compromising artistic intent. Experts from different areas of game development will discuss practical approaches, tools, and case studies that showcase how audio can bridge gaps in accessibility.

Discussion Topics:

• Why is sound crucial for accessibility in video games? Audio cues, spatial sound, and adaptive music can replace or complement visual elements, guiding players with disabilities through complex environments and interactions.
• Designing effective spatial audio for navigation and interaction. Using 3D audio and binaural rendering to provide players with intuitive sound-based navigation, enhancing orientation and gameplay flow for blind or visually impaired users.
• Audio feedback and sonification as key accessibility tools. Implementing detailed auditory feedback for in-game actions, menu navigation, and contextual cues to improve usability and player experience.
• Case studies of games with exemplary accessible audio design. Examining how games like The Last of Us Part II, BROK: The InvestiGator, and other titles have successfully integrated sound-based accessibility features.
• Tools and middleware solutions for accessible sound design (example: InclusivityForge). Showcasing how game engines and plugins such as InclusivityForge can streamline the implementation of accessibility-focused audio solutions.
• Challenges in designing accessible game audio and overcoming them. Addressing common technical and creative challenges when designing inclusive audio experiences, including balancing accessibility with immersive design.
• Future trends in accessibility-driven audio design. Exploring how AI, procedural sound, and new hardware technologies can push the boundaries of accessibility in interactive audio environments.

Panel Guests:

• Dr Joanna Pigulak - accessibility expert in games, researcher specializing in game audio accessibility, assistant professor at the Institute of Film, Media, and Audiovisual Arts at UAM.
• Tomasz Tworek - accessibility consultant, blind gamer, and audio design collaborator specializing in improving audio cues and sonification in video games.
• Dr Tomasz Żernicki - sound engineer, creator of accessibility-focused audio technologies for games, and founder of InclusivityForge.

Target Audience:

• Sound engineers and game audio designers looking to implement accessibility features in their projects.
• Game developers interested in leveraging audio as a tool for accessibility.
• UX designers and researchers focusing on sound-based interaction in gaming.
• Middleware and tool developers aiming to create better solutions for accessible audio design.
• Industry professionals seeking to align with accessibility regulations and best practices.

This panel discussion will explore how sound engineers can enhance game accessibility through innovative audio solutions, providing insights into the latest tools, design techniques, and industry best practices.
Speakers
avatar for Tomasz Żernicki

Tomasz Żernicki

co-founder, my3DAudio
Tomasz Zernicki is co-founder and former CEO of Zylia (www.zylia.co), an innovative company that provides tools for 3D audio recording and music production. Additionally, he is a founder of my3DAudio Ventures, whose goal is to scale audio companies that reach the MVP phase and want...
Friday May 23, 2025 9:00am - 10:00am CEST
Hall F ATM Studio Warsaw, Poland

9:30am CEST

Education & Career Fair
Friday May 23, 2025 9:30am - 11:30am CEST
Friday May 23, 2025 9:30am - 11:30am CEST
Hall F ATM Studio Warsaw, Poland

10:15am CEST

Fast facts on room acoustics
Friday May 23, 2025 10:15am - 11:45am CEST
If you are considering establishing a room for sound, i.e., recording, mixing, editing, listening, or even a room for live music, this is the crash course to attend!
Initially, we’ll walk through the essential considerations for any design of an acoustic space, (almost) no matter the purpose: appropriate reverberation time, appropriate sound distribution, low background noise, no echoes/flutter echoes, appropriate control of early reflections (and, for stereo/surround/immersive, a degree of room symmetry).
To prevent misunderstandings, we must define the difference between room acoustics and building acoustics. This is a tutorial on room acoustics! Finding the right reverberation time for a project depends on the room’s purpose. We’ll look into some relevant standards to find an appropriate target value and pay attention to the importance of the room’s frequency balance, especially at low frequencies! We will take Sabine’s equation as the starting point for calculation and discuss the conditions needed to make it work.
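As a taste of the kind of calculation the tutorial covers, here is a worked Sabine example for a hypothetical shoebox room (all numbers are illustrative):

```python
# Sabine: RT60 = 0.161 * V / A, with V in m^3 and A = sum(alpha_i * S_i)
V = 6.0 * 5.0 * 3.0                  # room volume, m^3 (illustrative)
surfaces = [                         # (area in m^2, absorption coefficient)
    (2 * 6.0 * 5.0, 0.10),           # floor + ceiling
    (2 * 6.0 * 3.0, 0.30),           # long walls (treated)
    (2 * 5.0 * 3.0, 0.05),           # short walls (untreated)
]
A = sum(s * a for s, a in surfaces)  # equivalent absorption area, m^2 sabins
rt60 = 0.161 * V / A
print(f"A = {A:.1f} m^2 sabins, RT60 = {rt60:.2f} s")
```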
The room’s shape, its effect on room modes, and the distribution of those modes are covered (together with the term Schroeder frequency). The acoustical properties of some conventional building materials, and the consequences of choosing one over another for the basic design, are discussed. Membrane absorbers (plasterboard, plywood, gypsum board) and their importance in proper room design are presented here. This also involves the definition of absorption coefficients (and how to obtain them).
From the “raw” room and its properties, we move on to defining the acoustic treatment needed to reach the target value. Again, the treatment can often be achieved with cheaper building materials; however, plenty of expensive specialized materials are also available. We’ll try to find a way through the jungle, keeping an eye on the spending. The typical tools are porous absorbers for smaller rooms; sometimes, resonance absorbers are used for larger rooms. We don’t want to overkill the high frequencies!
The placement of the sound sources in the room influences the perceived sound, and a few basic rules are given. Elements to control the sound field are discussed: absorption vs. diffusion. Some simple principles for DIY diffusers are shown.
During the presentation, various practical solutions are presented. At the end of the tutorial, there will be some time for a short Q&A.
Speakers
Eddy B. Brixen

consultant, EBB-consult
Eddy B. Brixen received his education in electronic engineering from the Danish Broadcasting Corporation, the Copenhagen Engineering College, and the Technical University of Denmark. Major activities include room acoustics, electro-acoustic design, and audio forensics. He is a consultant...
Friday May 23, 2025 10:15am - 11:45am CEST
Hall F ATM Studio Warsaw, Poland

12:00pm CEST

Acoustic analysis of ancient stadia: from the Circus Maximus of Rome to the Hippodrome of Constantinople
Friday May 23, 2025 12:00pm - 1:30pm CEST
Speakers
Antonella Bevilacqua

University of Parma
Friday May 23, 2025 12:00pm - 1:30pm CEST
Hall F ATM Studio Warsaw, Poland

12:00pm CEST

Adaptive Room Acoustics Optimisation Using Virtual Microphone Techniques
Friday May 23, 2025 12:00pm - 1:30pm CEST
Room acoustics optimisation in live sound environments using signal processing techniques has captivated the minds of audio enthusiasts and researchers alike for over half a century. From analogue filters in the 1950s, to modern research efforts such as room impulse response equalisation and adaptive sound field control, this subject has exploded to life. Controlling the sound field in a static acoustic space is complex due to the high number of system variables, such as reflections, speaker crosstalk, equipment-induced coloration, room modes, reverberation, diffraction and listener positioning. These challenges are further amplified by dynamic variables such as audience presence, environmental conditions and room occupancy changes, which continuously and unpredictably reshape the sound field.
A primary objective of live sound reinforcement is to deliver uniform sound quality across the audience area. This is most critical at audience ear level, where tonal balance, clarity, and spatial imaging are most affected by variations in the sound field. While placing microphones at audience ear level positions could enable real-time monitoring, large-scale deployment is impractical due to audience interference.
This research will explore the feasibility of an adaptive virtual microphone-based approach to room acoustics optimisation. By strategically placing microphone arrays and leveraging virtual microphone technology, the system estimates the sound field dynamically at audience ear level without requiring physical microphones. By continuously repositioning focal points across listening zones, a small number of arrays could effectively monitor large audience areas. If accurate estimations can be achieved, real-time sound field control becomes more manageable and effective.
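The abstract does not specify the estimation algorithm; as a point of reference, the simplest way to "focus" an array at a virtual listening position is delay-and-sum beamforming, sketched here under free-field assumptions (not the authors' method):

```python
import numpy as np

def virtual_mic(signals, mic_pos, focus, fs, c=343.0):
    """Delay-and-sum estimate of the pressure at a virtual focal point.
    signals: (n_mics, n_samples); mic_pos (n_mics, 3) and focus (3,) in
    metres. Amplitude (1/r) differences are deliberately ignored here."""
    dists = np.linalg.norm(mic_pos - focus, axis=1)
    delays = (dists - dists.min()) / c          # relative propagation delays
    n = signals.shape[1]
    freqs = np.fft.rfftfreq(n, 1 / fs)
    out = np.zeros(n)
    for sig, tau in zip(signals, delays):
        # Advance each mic by its extra delay to align arrivals from focus.
        spec = np.fft.rfft(sig) * np.exp(2j * np.pi * freqs * tau)
        out += np.fft.irfft(spec, n)
    return out / len(signals)
```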
Speakers
Gavin Kearney

Professor of Audio Engineering, University of York
Gavin Kearney graduated from Dublin Institute of Technology in 2002 with an Honors degree in Electronic Engineering and has since obtained MSc and PhD degrees in Audio Signal Processing from Trinity College Dublin. He joined the University of York as Lecturer in Sound Design in January...
Friday May 23, 2025 12:00pm - 1:30pm CEST
Hall F ATM Studio Warsaw, Poland

12:00pm CEST

Analysis of the Sound Pressure Level Distribution in the Low-Frequency Range Below the First Modal Frequency in Small Room Acoustics
Friday May 23, 2025 12:00pm - 1:30pm CEST
The occurrence of eigenmodes is one of the fundamental phenomena in the acoustics of small rooms. The formation of modes results in an uneven distribution of the sound pressure level in the room. To determine the resonance frequencies and their distributions, numerical methods, analytical methods, or experimental studies are used. For the purposes of this paper, an experimental study was carried out in a small room. The study analysed the results of measuring the sound pressure level distributions in the room, with a special focus on the frequency range 20 Hz - 32 Hz, below the first modal frequency of the room. The measurements were conducted on a rectangular grid of 9 x 9 microphones, giving a grid resolution of 0.5 m. The influence of evanescent modes on the total sound field was investigated. The research takes into account several sound source locations. On the basis of the acoustic measurements carried out, frequency response curves were also plotted. This paper presents a few methods for analysing these curves based on the standard deviation, the linear least squares method, the coefficient of determination R^2, and the root mean squared error (RMSE). The results obtained made it possible to determine the best position of the acoustic source in the room under study.
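The curve-analysis methods named above are standard; a compact sketch of scoring a measured response against a linear least-squares fit (array names and the exact scoring are assumptions, not the paper's code) might look like:

```python
import numpy as np

def response_flatness_metrics(freq_hz, level_db):
    """Score a frequency response curve: standard deviation, linear
    least-squares fit, coefficient of determination R^2, and RMSE."""
    slope, intercept = np.polyfit(freq_hz, level_db, 1)  # linear LSQ fit
    fit = slope * freq_hz + intercept
    resid = level_db - fit
    ss_res = np.sum(resid ** 2)
    ss_tot = np.sum((level_db - level_db.mean()) ** 2)
    return {
        "std_db": np.std(level_db),
        "slope_db_per_hz": slope,
        "r2": 1 - ss_res / ss_tot,
        "rmse_db": np.sqrt(np.mean(resid ** 2)),
    }
```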
Friday May 23, 2025 12:00pm - 1:30pm CEST
Hall F ATM Studio Warsaw, Poland

12:00pm CEST

Diffuse Signal Processing (DiSP) as a Method of Decorrelating a Stereo Mix to Increase Mono Compatibility
Friday May 23, 2025 12:00pm - 1:30pm CEST
Mono compatibility is a fundamental challenge in audio production, ensuring that stereo mixes retain clarity, balance, and spectral integrity when summed to mono. Traditional stereo widening techniques often introduce phase shifts, comb filtering, and excessive decorrelation, causing perceptual loss of critical mix elements in mono playback. Diffuse Signal Processing (DiSP) is introduced as a convolution-based method that improves mono compatibility while maintaining stereo width.

This study investigates the application of DiSP to the left and right channels of a stereo mix, leveraging MATLAB-synthesized TDI (temporally diffuse impulse) responses to introduce spectrally balanced, non-destructive acoustic energy diffusion. TDI convolution is then applied to both the left and right channels of the final stereo mix.

A dataset of stereo mixes from four genres (electronic, heavy metal, orchestral, and pop/rock) was analyzed. The study evaluated phase correlation, mono-summed frequency response deviation, and the amount of comb filtering to quantify improvements in mono summation. Spectral plots and wavelet transforms provided objective analysis. Results demonstrated that DiSP reduced phase cancellation, significantly decreased comb filtering artifacts, and improved spectral coherence in mono playback while preserving the stereo width of the original mix. Applying this process to the final left and right channels allows an engineer to mix freely without concern for the mono mix’s compatibility.

DiSP’s convolution-based approach offers a scalable, adaptive solution for modern mixing and mastering workflows, overcoming the limitations of traditional stereo processing. Future research includes machine learning-driven adaptive DiSP, frequency-dependent processing enhancements, and expansion to spatial audio formats (5.1, 7.1, Dolby Atmos) to optimize mono downmixing. The findings confirm DiSP as a robust and perceptually transparent method for improving mono compatibility without compromising stereo imaging.
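The evaluation measures mentioned are standard and easy to reproduce; here is a hedged sketch of a phase-correlation coefficient and a mono-sum deviation measure (illustrative implementations, not the authors' exact analysis code):

```python
import numpy as np

def phase_correlation(left, right):
    """Stereo correlation coefficient: +1 is fully mono-compatible,
    0 is decorrelated, -1 is out of phase (cancels when summed)."""
    num = np.sum(left * right)
    den = np.sqrt(np.sum(left ** 2) * np.sum(right ** 2))
    return num / den

def mono_sum_deviation_db(left, right, n_fft=4096):
    """Per-bin level change of the mono sum relative to the average of
    the two channel spectra: a rough indicator of comb-filtering losses."""
    L = np.abs(np.fft.rfft(left, n_fft))
    R = np.abs(np.fft.rfft(right, n_fft))
    M = np.abs(np.fft.rfft(0.5 * (left + right), n_fft))
    eps = 1e-12
    return 20 * np.log10((M + eps) / (0.5 * (L + R) + eps))
```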
Speakers
Tommy Spurgeon

Physics Student & Undergraduate Researcher, University of South Carolina
Friday May 23, 2025 12:00pm - 1:30pm CEST
Hall F ATM Studio Warsaw, Poland

12:00pm CEST

Instantaneous Low-frequency Energetic Analysis for Detection of Standing Waves
Friday May 23, 2025 12:00pm - 1:30pm CEST
Standing waves are a phenomenon ever-present in the reproduction of low frequencies and have a direct impact on the auditory perception of this frequency region.
This study addresses the challenges posed by standing waves, which are difficult to measure accurately using conventional pressure microphones due to their spatial and temporal characteristics. To combat these issues, a state-of-the-art sound pressure-velocity probe specifically designed for measuring intensity in the low-frequency spectrum is developed. Using this probe, the research includes the development of new energy estimation parameters to better quantify the characteristics of sound fields influenced by standing waves. Additionally, a novel "standing-wave-ness" parameter is proposed, based on two diffuseness quantities dealing with the proportion of locally confined energy and the temporal variation of the intensity vectors. The performance of the new method and probe is evaluated through both simulated and real-world measurement data. Simulations provide a controlled environment to assess the method's accuracy across a variety of scenarios, including both standing wave and non-standing wave conditions. These initial simulations are followed by validation through measurement data obtained from an anechoic chamber, ensuring that the method's capabilities are tested in highly controlled, close-to-real-world settings. Preliminary results from this dual approach show promising potential for the new method to quantify the presence of standing waves, adding a new dimension to the visualisation and understanding of low-frequency phenomena.
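The paper's exact "standing-wave-ness" parameter is not given in the abstract; for orientation, one widely used energetic diffuseness estimator built from the same pressure-velocity quantities (as in directional audio coding) is sketched below:

```python
import numpy as np

RHO0, C = 1.204, 343.0  # air density (kg/m^3), speed of sound (m/s)

def diffuseness(p, u):
    """Energetic diffuseness from pressure p (n_samples,) in Pa and particle
    velocity u (n_samples, 3) in m/s. Approaches 1 when little net energy
    flows (e.g. in a pure standing wave, where active intensity averages
    to zero), and 0 for a single propagating plane wave."""
    intensity = np.mean(p[:, None] * u, axis=0)            # active intensity <I>
    energy = np.mean(0.5 * RHO0 * np.sum(u ** 2, axis=1)
                     + p ** 2 / (2 * RHO0 * C ** 2))       # mean energy density <E>
    return 1.0 - np.linalg.norm(intensity) / (C * energy)
```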
Speakers
Madalina Nastasa

Doctoral Researcher, Aalto University
Doctoral researcher at the Acoustics Lab of Aalto University, passionate about everything audio. My research focuses on the human perception of the very low frequency spectrum, and so does my day-to-day life. When I am not in the Acoustics Lab, I organise electronic music events where...
Aki Mäkivirta

R&D Director, Genelec Oy
Aki Mäkivirta is R&D Director at Genelec, Iisalmi, Finland, and has been with Genelec since 1995. He received his Master of Science, Licentiate of Science, and Doctor of Science in Technology degrees from Tampere University of Technology in 1985, 1989, and 1992, respectively. Aki...
Friday May 23, 2025 12:00pm - 1:30pm CEST
Hall F ATM Studio Warsaw, Poland

1:45pm CEST

A Testbed for Detecting DeepFake Audio
Friday May 23, 2025 1:45pm - 3:45pm CEST
The rapid advancement of generative artificial intelligence has created highly realistic DeepFake multimedia content, posing significant challenges for digital security and authenticity verification. This paper presents the development of a comprehensive testbed designed to detect counterfeit audio content generated by DeepFake techniques. The proposed framework integrates forensic spectral analysis, numerical and statistical modeling, and machine learning-based detection to assess the authenticity of multimedia samples. Our study evaluates various detection methodologies, including spectrogram comparison, Euclidean distance-based analysis, pitch modulation assessment, and spectral flatness deviations. The results demonstrate that cloned and synthetic voices exhibit distinctive acoustic anomalies, with forensic markers such as pitch mean absolute error and power spectral density variations serving as effective indicators of manipulation. By systematically analyzing human, cloned, and synthesized voices, this research provides a foundation for advancing DeepFake detection strategies. The proposed testbed offers a scalable and adaptable solution for forensic audio verification, contributing to the broader effort of safeguarding multimedia integrity in digital environments.
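Two of the forensic markers mentioned, spectral flatness and power-spectral-density deviation, are straightforward to compute; a minimal sketch (illustrative feature extraction, not the paper's testbed code) follows:

```python
import numpy as np
from scipy.signal import welch

def spectral_flatness(x, fs, nperseg=2048):
    """Geometric-to-arithmetic mean ratio of the power spectrum; synthetic
    voices often deviate from natural speech on this measure."""
    _, psd = welch(x, fs, nperseg=nperseg)
    psd = psd[psd > 0]
    return np.exp(np.mean(np.log(psd))) / np.mean(psd)

def psd_distance(x, y, fs, nperseg=2048):
    """Euclidean distance between log-PSDs of two recordings: a simple
    marker for comparing a suspect sample against a reference voice."""
    _, px = welch(x, fs, nperseg=nperseg)
    _, py = welch(y, fs, nperseg=nperseg)
    return np.linalg.norm(np.log10(px + 1e-12) - np.log10(py + 1e-12))
```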
Friday May 23, 2025 1:45pm - 3:45pm CEST
Hall F ATM Studio Warsaw, Poland

1:45pm CEST

An audio quality metrics toolbox for media assets management, content exchange, and dataset alignment
Friday May 23, 2025 1:45pm - 3:45pm CEST
Content exchange and collaboration serve as catalysts for repository creation that supports creative industries and fuels model development in machine learning and AI. Despite numerous repositories, challenges persist in discoverability, rights preservation, and efficient reuse of audiovisual assets. To address these issues, the SCENE (Searchable multi-dimensional Data Lakes supporting Cognitive Film Production & Distribution for the Promotion of the European Cultural Heritage) project introduces an automated audio quality assessment toolkit integrated within its Media Assets Management (MAM) platform. This toolkit comprises a suite of advanced metrics, such as artifact detection, bandwidth estimation, compression history analysis, noise profiling, speech intelligibility, environmental sound recognition, and reverberation characterization. The metrics are extracted using dedicated Flask-based web services that interface with a data lake architecture. By streamlining the inspection of large-scale audio repositories, the proposed solution benefits both high-end film productions and smaller-scale collaborations. The pilot phase of the toolkit will involve professional filmmakers who will provide feedback to refine post-production workflows. This paper presents the motivation, design, and implementation details of the toolkit, highlighting its potential to support content quality management and contribute to more efficient content exchange in the creative industries.
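To illustrate the architecture, a Flask web service exposing one audio metric could look like the sketch below; the endpoint name, payload format, and use of the soundfile package are assumptions, not the SCENE platform's actual API:

```python
from flask import Flask, request, jsonify
import numpy as np
import soundfile as sf  # assumed available for decoding the upload

app = Flask(__name__)

@app.route("/metrics/flatness", methods=["POST"])
def flatness():
    # Expect a multipart WAV upload; return one illustrative quality metric.
    audio, fs = sf.read(request.files["file"])
    if audio.ndim > 1:
        audio = audio.mean(axis=1)              # mix down to mono
    spec = np.abs(np.fft.rfft(audio)) ** 2 + 1e-12
    sfm = float(np.exp(np.mean(np.log(spec))) / np.mean(spec))
    return jsonify({"sample_rate": fs, "spectral_flatness": sfm})

if __name__ == "__main__":
    app.run(port=5000)
```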
Speakers
Nikolaos Vryzas

Aristotle University Thessaloniki
Dr. Nikolaos Vryzas was born in Thessaloniki in 1990. He studied Electrical & Computer Engineering at the Aristotle University of Thessaloniki (AUTh). After graduating, he received his master's degrees in Information and Communication Audio Video Technologies for Education & Production...
Iordanis Thoidis

Aristotle University of Thessaloniki
Lazaros Vrysis

Aristotle University of Thessaloniki
Friday May 23, 2025 1:45pm - 3:45pm CEST
Hall F ATM Studio Warsaw, Poland

1:45pm CEST

Application for Binaural Audio Plays: Development of Auditory Perception and Spatial Orientation
Friday May 23, 2025 1:45pm - 3:45pm CEST
When navigating the environment, we primarily rely on sight. However, in its absence, individuals must develop precise spatial awareness using other senses. A blind person can recognize their immediate surroundings through touch, but assessing larger spaces requires auditory perception.
This project presents a method for auditory training in children with visual disabilities through structured audio plays designed to teach spatial pronouns and enhance spatial orientation via auditory stimuli. The format and structure of these audio plays allow for both guided learning with a mentor and independent exploration. Binaural recordings serve as the core component of the training exercises. The developed audio plays and their analyses are available on the YouTube platform in the form of videos and interactive exercises.
The next step of this project involves developing an application that enables students to create individual accounts and track their progress. Responses collected during exercises will help assess the impact of the audio plays on students, facilitating improvements and modifications to the training materials.
Additionally, linking vision-related questions with responses to auditory exercises will, over time, provide insights into the correlation between these senses. The application can serve multiple purposes: collecting research data, offering spatial recognition and auditory perception training, and creating a comprehensive, structured environment for auditory skill development.
Friday May 23, 2025 1:45pm - 3:45pm CEST
Hall F ATM Studio Warsaw, Poland

1:45pm CEST

Exploring the Process of Interconnected Procedurally Generated Visual and Audial Content
Friday May 23, 2025 1:45pm - 3:45pm CEST
This paper investigates the innovative synthesis of procedurally generated visual and auditory content through the use of Artificial Intelligence (AI) Tools, specifically focusing on Generative Pre-Trained Transformer (GPT) networks.
This research explores the process of procedurally generating audiovisual representations of semantic context by generating images, artificially providing motion, and generating corresponding multilayered sound. The process enables the generation of stop-motion audiovisual representations of concepts.
This approach not only highlights the capacity for Generative AI to produce cohesive and semantically rich audiovisual media but also delves into the interconnections between visual art, music, sonification, and computational creativity. By examining the synergy between generated imagery and corresponding soundscapes, this research paper aims to uncover new insights into the aesthetic and technical implications of the use of AI in art.
This research embodies a direct application of AI technology across multiple disciplines creating intermodal media. Research findings propose a novel framework for understanding and advancing the use of AI in the creative processes, suggesting potential pathways for future interdisciplinary research and artistic expression.
Through this work, this study contributes to the broader discourse on the role of AI in enhancing creative practices, offering perspectives on how various modes of semantic representation can be interleaved using state-of-the-art technology.
Friday May 23, 2025 1:45pm - 3:45pm CEST
Hall F ATM Studio Warsaw, Poland

1:45pm CEST

G.A.D.A.: Guitar Audio Dataset for AI - An Open-Source Multi-Class Guitar Corpus
Friday May 23, 2025 1:45pm - 3:45pm CEST
We present G.A.D.A. (Guitar Audio Dataset for AI), a novel open-source dataset designed for advancing research in guitar audio analysis, signal processing, and machine learning applications. This comprehensive corpus comprises recordings from three main guitar categories: electric, acoustic, and bass guitars, featuring multiple instruments within each category to ensure dataset diversity and robustness.

The recording methodology employs two distinct approaches based on instrument type. Electric and bass guitars were recorded using direct recording techniques via DI boxes, providing clean, unprocessed signals ideal for further digital processing and manipulation. For acoustic guitars, where direct recording was not feasible, we utilized multiple microphone configurations at various positions to capture the complete acoustic properties of the instruments. Both recording approaches prioritize signal quality while maintaining maximum flexibility for subsequent processing and analysis.

The dataset includes standardized recordings of major and minor chords played in multiple positions and voicings across all instruments. Each recording is accompanied by detailed metadata, including instrument specifications, recording equipment details, microphone configurations (for acoustic guitars), and chord information. The clean signals from electric instruments enable various post-processing applications, including virtual amplifier modeling, effects processing, impulse response convolution, and room acoustics simulation.

To evaluate G.A.D.A.'s effectiveness in machine learning applications, we propose a comprehensive testing framework using established algorithms including k-Nearest Neighbors, Support Vector Machines, Convolutional Neural Networks, and Feed-Forward Neural Networks. These experiments will focus on instrument classification tasks using both traditional audio features and deep learning approaches.

G.A.D.A. will be freely available for academic and research purposes, complete with documentation, preprocessing scripts, example code, and usage guidelines. This resource aims to facilitate research in musical instrument classification, audio signal processing, deep learning applications in music technology, computer-aided music education, and automated music transcription systems.

The combination of standardized recording methodologies, comprehensive metadata, and the inclusion of both direct-recorded and multi-microphone captured audio makes G.A.D.A. a valuable resource for comparative studies and reproducible research in music information retrieval and audio processing.
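As an illustration of the proposed evaluation framework, a baseline instrument classifier over such a dataset could be as simple as the following sketch (feature choice, file layout, and label names are assumptions):

```python
import numpy as np
import librosa
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

def mfcc_features(path, n_mfcc=20):
    """Mean MFCC vector as a simple per-file feature (illustrative only)."""
    y, fs = librosa.load(path, sr=None, mono=True)
    return librosa.feature.mfcc(y=y, sr=fs, n_mfcc=n_mfcc).mean(axis=1)

def baseline_accuracy(files, labels):
    """5-fold cross-validated k-NN accuracy on instrument-class labels
    (e.g. "electric" / "acoustic" / "bass"), one audio file per sample."""
    X = np.stack([mfcc_features(f) for f in files])
    clf = KNeighborsClassifier(n_neighbors=5)
    return cross_val_score(clf, X, labels, cv=5).mean()
```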
Friday May 23, 2025 1:45pm - 3:45pm CEST
Hall F ATM Studio Warsaw, Poland

4:00pm CEST

Binamix - A Python Library for Generating Binaural Audio Datasets
Friday May 23, 2025 4:00pm - 6:00pm CEST
The increasing demand for spatial audio in applications such as virtual reality, immersive media, and spatial audio research necessitates robust solutions for binaural audio dataset generation for testing and validation. Binamix is an open-source Python library designed to facilitate programmatic binaural mixing using the extensive SADIE II Database, which provides HRIR and BRIR data for 20 subjects. The Binamix library provides a flexible and repeatable framework for creating large-scale spatial audio datasets, making it an invaluable resource for codec evaluation, audio quality metric development, and machine learning model training. A range of pre-built example scripts, utility functions, and visualization plots further streamline the process of custom pipeline creation. This paper presents an overview of the library's capabilities, including binaural rendering, impulse response interpolation, and multi-track mixing for various speaker layouts. The tools utilize a modified Delaunay triangulation technique to achieve accurate HRIR/BRIR interpolation where desired angles are not present in the data. By supporting a wide range of parameters such as azimuth, elevation, subject IRs, speaker layouts, mixing controls, and more, the library enables researchers to create large binaural datasets for any downstream purpose. Binamix empowers researchers and developers to advance spatial audio applications with reproducible methodologies by offering an open-source solution for binaural rendering and dataset generation.
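Binamix's "modified" Delaunay technique presumably also handles spherical wrap-around; the sketch below shows only the generic barycentric-interpolation idea on a planar (azimuth, elevation) grid and is not the library's actual API:

```python
import numpy as np
from scipy.spatial import Delaunay

def interpolate_hrir(az_el_deg, hrirs, query_deg):
    """Barycentric interpolation of HRIRs measured on a sparse grid.
    az_el_deg: (n, 2) measured directions; hrirs: (n, taps); query_deg: (2,).
    Note: a plain planar triangulation ignores azimuth wrap-around."""
    tri = Delaunay(az_el_deg)
    simplex = tri.find_simplex(query_deg)
    if simplex < 0:
        raise ValueError("query direction outside the measured hull")
    verts = tri.simplices[simplex]
    T = tri.transform[simplex]                     # affine map to barycentric
    bary = T[:2].dot(np.asarray(query_deg) - T[2])
    weights = np.append(bary, 1 - bary.sum())      # weights sum to 1
    return np.tensordot(weights, hrirs[verts], axes=1)
```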
Speakers
Jan Skoglund

Google
Jan Skoglund leads a team at Google in San Francisco, CA, developing speech and audio signal processing components for capture, real-time communication, storage, and rendering. These components have been deployed in Google software products such as Meet and hardware products such...
Friday May 23, 2025 4:00pm - 6:00pm CEST
Hall F ATM Studio Warsaw, Poland

4:00pm CEST

Neural 3D Audio Renderer for acoustic digital twin creation
Friday May 23, 2025 4:00pm - 6:00pm CEST
In this work, we introduce a Neural 3D Audio Renderer (N3DAR) - a conceptual solution for creating acoustic digital twins of arbitrary spaces. We propose a workflow that consists of several stages including:
1. Simulation of high-fidelity Spatial Room Impulse Responses (SRIR) based on the 3D model of a digitalized space,
2. Building an ML-based model of this space for interpolation and reconstruction of SRIRs,
3. Development of a real-time 3D audio renderer that allows the deployment of the digital twin of a space with accurate spatial audio effects consistent with the actual acoustic properties of this space.
The first stage consists of preparation of the 3D model and running the SRIR simulations using the state-of-the-art wave-based method for arbitrary pairs of source-receiver positions. This stage provides a set of learning data being used in the second stage - training the SRIR reconstruction model. The training stage aims to learn the model of the acoustic properties of the digitalized space using the Acoustic Volume Rendering approach (AVR). The last stage is the construction of a plugin with a dedicated 3D audio renderer where rendering comprises reconstruction of the early part of the SRIR, estimation of the reverb part, and HOA-based binauralization.
N3DAR allows the building of tailored audio rendering plugins that can be deployed along with visual 3D models of digitalized spaces, where users can freely navigate through the space with 6 degrees of freedom and experience high-fidelity binaural playback in real time.
We provide a detailed description of the challenges and considerations for each of the stages. We also conduct an extensive evaluation of the audio rendering capabilities with both objective metrics and subjective methods, using a dedicated evaluation platform.
Friday May 23, 2025 4:00pm - 6:00pm CEST
Hall F ATM Studio Warsaw, Poland

4:00pm CEST

Performance Estimation Method for 3D Microphone Array based on the Modified Steering Vector in Spherical Harmonic Domain
Friday May 23, 2025 4:00pm - 6:00pm CEST
This paper presents an objective method for estimating the performance of 3D microphone arrays, which is also applicable to 2D arrays. The method incorporates the physical characteristics and relative positions of the microphones, merging these elements through a weighted summation to derive the arrays' directional patterns. These patterns are represented as a "Modified Steering Vector." Additionally, leveraging the spatial properties of spherical harmonics, we transform the array's directional pattern into the spherical harmonic domain. This transformation enables a quantitative analysis of the physical properties of each component, providing a comprehensive understanding of the array's performance. Overall, the proposed method offers a deeply insightful and versatile framework for evaluating the performance of both 2D and 3D microphone arrays by fully exploiting their inherent physical characteristics.
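The "modified steering vector" folds each capsule's own directivity into the plane-wave steering term before the weighted summation; a free-field sketch of that construction (the first-order capsule model and on-sphere placement are assumptions, and the spherical-harmonic projection step is omitted) is:

```python
import numpy as np

def steering_vector(mic_pos, directions, f, c=343.0):
    """Plane-wave steering vectors for an array.
    mic_pos: (M, 3) metres; directions: (D, 3) unit vectors; returns (M, D)."""
    k = 2 * np.pi * f / c
    return np.exp(1j * k * mic_pos @ directions.T)

def array_pattern(weights, mic_pos, directions, f, alpha=0.5):
    """Directional pattern of the array: each capsule's directivity (a
    first-order alpha + (1 - alpha) * cos model, aimed radially outward,
    assuming capsules on a sphere) is folded into the plane-wave term,
    then the M capsule responses are combined by weighted summation."""
    v = steering_vector(mic_pos, directions, f)
    aim = mic_pos / np.linalg.norm(mic_pos, axis=1, keepdims=True)
    capsule = alpha + (1 - alpha) * (aim @ directions.T)   # (M, D)
    return weights.conj() @ (capsule * v)                  # pattern over D
```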
Friday May 23, 2025 4:00pm - 6:00pm CEST
Hall F ATM Studio Warsaw, Poland

4:00pm CEST

Reconstructing Sound Fields with Physics-Informed Neural Networks: Applications in Real-World Acoustic Environments
Friday May 23, 2025 4:00pm - 6:00pm CEST
The reconstruction of sound fields is a critical component in a range of applications, including spatial audio for augmented, virtual, and mixed reality (AR/VR/XR) environments, as well as for optimizing acoustics in physical spaces. Traditional approaches to sound field reconstruction predominantly rely on interpolation techniques, which estimate sound fields based on a limited number of spatial and temporal measurements. However, these methods often struggle with issues of accuracy and realism, particularly in complex and dynamic environments. Recent advancements in deep learning have provided promising alternatives, particularly with the introduction of Physics-Informed Neural Networks (PINNs), which integrate physical laws directly into the model training process. This study aims to explore the application of PINNs for sound field reconstruction, focusing on the challenge of predicting acoustic fields in unmeasured areas. The experimental setup involved the collection of impulse response data from the Promenadikeskus concert hall in Pori, Finland, using various source and receiver positions. The PINN framework is then utilized to simulate the hall’s acoustic behavior, with parameters incorporated to model sound propagation across different frequencies and source-receiver configurations. Despite challenges arising from computational load, pre-processing strategies were implemented to optimize the model's efficiency. The results demonstrate that PINNs can accurately reconstruct sound fields in complex acoustic environments, offering significant potential for real-time sound field control and immersive audio applications.
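For readers new to PINNs, the essence is adding a physics residual to the data-fitting loss; a minimal PyTorch sketch for the frequency-domain (Helmholtz) case, with all architecture and loss choices assumed rather than taken from the paper, is:

```python
import torch

class PressureNet(torch.nn.Module):
    """Tiny MLP mapping position (x, y, z) -> complex pressure (re, im)."""
    def __init__(self, width=64):
        super().__init__()
        self.body = torch.nn.Sequential(
            torch.nn.Linear(3, width), torch.nn.Tanh(),
            torch.nn.Linear(width, width), torch.nn.Tanh(),
            torch.nn.Linear(width, 2))

    def forward(self, xyz):
        return self.body(xyz)

def helmholtz_residual(net, xyz, k):
    """Physics term: mean squared residual of (laplacian + k^2) p = 0
    at collocation points, for real and imaginary parts separately."""
    xyz = xyz.clone().requires_grad_(True)
    p = net(xyz)
    loss = xyz.new_zeros(())
    for c in range(2):
        grad = torch.autograd.grad(p[:, c].sum(), xyz, create_graph=True)[0]
        lap = sum(torch.autograd.grad(grad[:, d].sum(), xyz,
                                      create_graph=True)[0][:, d]
                  for d in range(3))
        loss = loss + ((lap + k ** 2 * p[:, c]) ** 2).mean()
    return loss

# Training would minimise: mse(net(xyz_measured), p_measured)
#                          + lambda_pde * helmholtz_residual(net, xyz_free, k)
```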
Speakers
Rigas Kotsakis

Aristotle University of Thessaloniki
Iordanis Thoidis

Aristotle University of Thessaloniki
Nikolaos Vryzas

Aristotle University Thessaloniki
Dr. Nikolaos Vryzas was born in Thessaloniki in 1990. He studied Electrical & Computer Engineering at the Aristotle University of Thessaloniki (AUTh). After graduating, he received his master's degrees in Information and Communication Audio Video Technologies for Education & Production...
Lazaros Vrysis

Aristotle University of Thessaloniki
Friday May 23, 2025 4:00pm - 6:00pm CEST
Hall F ATM Studio Warsaw, Poland

4:00pm CEST

Recording and post-production of Dietrich Buxtehude's baroque cantatas in stereo and Dolby Atmos using an experimental 3D microphone array
Friday May 23, 2025 4:00pm - 6:00pm CEST
3D recordings seem to be an attractive solution when trying to achieve the effect of immersion. Dolby Atmos has recently become an increasingly popular format for distributing three-dimensional music recordings, although stereophony currently remains the main format for producing music recordings.

How can traditional microphone techniques be optimally extended when recording classical music, so that both stereo recordings and three-dimensional formats (e.g. Dolby Atmos) can be obtained in the post-production process? The author tries to answer this question using the example of a recording of Dietrich Buxtehude's work "Membra Jesu Nostri", BuxWV 75. The cycle of seven cantatas, composed in 1680, is one of the most important and most popular compositions of the early Baroque era. The first Polish recording was made by Arte Dei Suonatori conducted by Bartłomiej Stankowiak, with soloists and choral parts performed by the choir Cantus Humanus.

The author will present his concept of a microphone set for 3D recordings. In addition to the detailed microphone setup, he will cover the post-production method, combining the stereo mix with a Dolby Atmos mix in a 7.2.4 speaker configuration. A workflow will be proposed to facilitate switching between the different formats.
Friday May 23, 2025 4:00pm - 6:00pm CEST
Hall F ATM Studio Warsaw, Poland

4:00pm CEST

Subjective Evaluation on Three-dimensional VBAP and Ambisonics in an Immersive Concert Setting
Friday May 23, 2025 4:00pm - 6:00pm CEST
This paper investigates the subjective evaluation of two prominent three-dimensional spatialization techniques—Vector Base Amplitude Panning (VBAP) and High-Order Ambisonics (HOA)—using IRCAM’s Spat in an immersive concert setting. The listening test was conducted in the New Hall at the Royal Danish Academy of Music, which features a 44-speaker immersive audio system. The musical stimuli included electronic compositions and modern orchestral recordings, providing a diverse range of temporal and spectral content. The participants comprised experienced Tonmeisters and non-experienced musicians, who were seated in off-center positions to simulate real-world audience conditions. This study provides an ecologically valid subjective evaluation methodology.
The results indicated that VBAP excelled in spatial clarity and sound quality, while HOA demonstrated superior envelopment. The perceptual differences between the two techniques were relatively minor, influenced by room acoustics and suboptimal listening positions. Furthermore, music genre had no significant impact on the evaluation outcomes.
The study highlights VBAP’s strength in precise localization and HOA's capability for creating immersive soundscapes, aiming to bridge the gap between ideal and real-world applications in immersive sound reproduction and perception. The findings suggest the need to balance trade-offs when selecting spatialization techniques for specific purposes, venues, and audience positions. Future research will focus on evaluating a wider range of spatialization methods in concert environments and optimizing them to improve the auditory experience for distributed audiences.
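For context, the VBAP gains under test are computed per loudspeaker triplet by inverting the triplet's direction matrix (Pulkki's formulation); a minimal sketch:

```python
import numpy as np

def vbap_gains(p, triplet):
    """3D VBAP gains for a unit panning direction p (3,), using a
    loudspeaker triplet given as a 3x3 matrix of unit vectors (one
    speaker per row). Solves p = g @ L, then power-normalizes.
    A negative gain means the source lies outside this triplet and
    another triplet should be used instead."""
    g = np.asarray(p) @ np.linalg.inv(triplet)
    return g / np.linalg.norm(g)
```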
Speakers
Jesper Andersen

Head of Tonmeister Programme, Det Kgl Danske Musikkonservatorium
As a Grammy-nominated producer, engineer, and pianist, Jesper has recorded around 100 CDs and produced music for radio, TV, theatre, installations, and performance. Jesper has also worked as a sound engineer/producer at the Danish Broadcasting Corporation. A recent album production is...
Stefania Serafin

Professor, Aalborg University Copenhagen
I am Professor in Sonic Interaction Design at Aalborg University in Copenhagen and leader of the Multisensory Experience Lab together with Rolf Nordahl. I am the President of the Sound and Music Computing association, Project Leader of the Nordic Sound and Music Computing netwo...
Friday May 23, 2025 4:00pm - 6:00pm CEST
Hall F ATM Studio Warsaw, Poland

4:00pm CEST

Visualization of the spatial behavior between channels in surround program
Friday May 23, 2025 4:00pm - 6:00pm CEST
Speakers
Friday May 23, 2025 4:00pm - 6:00pm CEST
Hall F ATM Studio Warsaw, Poland

4:30pm CEST

Ask Us Anything About Starting Your Career
Friday May 23, 2025 4:30pm - 6:00pm CEST
Join a panel of professionals from a variety of fields in the industry as we discuss topics including how to enter the audio industry, how they each got started in their own careers and the path their careers took, and give advice geared towards students and recent graduates. Bring your questions for the panelists – most of this workshop will be focused on the information YOU want to hear!
Speakers
Ian Corbett

Coordinator & Professor, Audio Engineering & Music Technology, Kansas City Kansas Community College
Dr. Ian Corbett is the Coordinator and Professor of Audio Engineering and Music Technology at Kansas City Kansas Community College. He also owns and operates off-beat-open-hats LLC, providing live sound, recording, and audio production services to clients in the Kansas City area...
Friday May 23, 2025 4:30pm - 6:00pm CEST
Hall F ATM Studio Warsaw, Poland
  Audio in education
 
Saturday, May 24
 

9:00am CEST

Tutorial Workshop: The Gentle Art of Dithering
Saturday May 24, 2025 9:00am - 10:45am CEST
This tutorial is for everyone working on the design or production of digital audio and should benefit beginners and experts alike. We aim to bring this topic to life with several interesting audio demonstrations, and to bring it up to date with new insights and some surprising results that may reshape preconceptions of high resolution.
In a recent paper, we stressed that transparency (high-resolution audio fidelity) depends on the preservation of micro-sounds – those small details that are easily lost to quantization errors, but which can be perfectly preserved by using the right dither.
It is often asked: ‘Why should I add noise to my recording?’ or, ‘How can adding noise make things clearer?’ This tutorial gives a tour through these questions and presents a call to action: dither should not be looked on as added noise, but as an essential lubricant that preserves naturalness.

Tutorial topics include: fundamentals of dithering; analysis using histograms and synchronous averaging; what happens if undithered quantizers are cascaded?; ‘washboard distortion’; noise-shaping; additive and subtractive dither; time-domain effects; inside A/D and D/A converters; the perilous world of modern signal chains (including studio workflow and DSP in fixed and floating-point processors) and, finally, audibility analysis.
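As a small taste of the material, here is a minimal sketch of TPDF-dithered quantization. It illustrates the basic technique only (not the presenters' demonstration code); the signal level in the example is chosen to sit near one LSB, where the contrast between dithered and truncated quantization is most audible.

```python
# Minimal sketch of quantization to a given bit depth with optional TPDF dither.
import numpy as np

def quantize(x, bits, dither=True):
    """Quantize a float signal in [-1, 1) to `bits` bits (mid-tread)."""
    q = 2.0 ** -(bits - 1)          # quantization step (1 LSB)
    if dither:
        # TPDF dither: sum of two independent uniform noises, +/-1 LSB peak.
        x = x + q * (np.random.uniform(-0.5, 0.5, x.shape)
                     + np.random.uniform(-0.5, 0.5, x.shape))
    return q * np.round(x / q)

# A sine at roughly 1 LSB of a 16-bit system: with dither it survives as a
# clean tone in noise; without dither it collapses into 'washboard' distortion.
fs = 48000
t = np.arange(fs) / fs
x = 3e-5 * np.sin(2 * np.pi * 1000 * t)
y_dithered = quantize(x, 16, dither=True)
y_truncated = quantize(x, 16, dither=False)
```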
Saturday May 24, 2025 9:00am - 10:45am CEST
Hall F ATM Studio Warsaw, Poland

9:00am CEST

Extension of Reflection-Free Region for Loudspeaker Measurements
Saturday May 24, 2025 9:00am - 10:45am CEST
If loudspeaker measurements are carried out elevated over a flat, very reflective surface with no nearby obstacles, the recovered impulse response will contain the direct response and one clean delayed reflection. Many loudspeakers are omnidirectional at low frequencies, having a clear acoustic centre, and this reflection will have a low-frequency behaviour that is essentially the same as the direct response, except that the amplitude will be down by a 1/r factor. We derive a simple algorithm that allows this reflection to be cancelled iteratively, so that the response of the loudspeaker will be valid to lower frequencies than before, complementing the usual high-frequency response obtained from simple time-truncation of the impulse response. The method is explained, discussed, and illustrated with a two-way system measured over a flat, sealed driveway surface.
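The paper's own algorithm is not reproduced here, but the following sketch illustrates the general idea under the stated premise: the reflection is a delayed copy of the direct sound attenuated by a known 1/r factor. The parameter names and the fixed iteration count are illustrative assumptions.

```python
# Sketch of iteratively cancelling a single ground reflection from a measured
# impulse response h = d + attenuation * delay(d), where d is the direct sound.
# `delay_samples` and `attenuation` would come from the measurement geometry.
import numpy as np

def cancel_reflection(h, delay_samples, attenuation, n_iter=8):
    """Estimate the direct response d from the measured response h."""
    d = h.copy()
    for _ in range(n_iter):
        shifted = np.zeros_like(d)
        shifted[delay_samples:] = d[:-delay_samples]
        d = h - attenuation * shifted   # refine the direct-sound estimate
    return d                            # converges since attenuation < 1
```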
Speakers
Saturday May 24, 2025 9:00am - 10:45am CEST
Hall F ATM Studio Warsaw, Poland

9:00am CEST

Impact of Voice-Coil Temperature on Electroacoustic Parameters for Optimized Loudspeaker Enclosure Design in Small-Signal Response
Saturday May 24, 2025 9:00am - 10:45am CEST
The study of electroacoustic parameters in relation to loudspeaker temperature has predominantly focused on large-signal conditions (i.e., high-power audio signals), with limited attention to their behavior under small-signal conditions at equivalent thermal states. This research addresses this gap by investigating the influence of voice-coil temperature on electroacoustic parameters during small-signal operation. The frequency response of the electrical input impedance and the radiated acoustic pressure were measured across different voice-coil temperatures. The results revealed temperature-dependent shifts across all parameters, including the natural frequency in free air (fₛ), mechanical quality factor (Qₘₛ), electrical resistance (Rₑ), electrical inductance (Lₑ), and equivalent compliance volume (Vₐₛ), among others. Specifically, Rₑ and Lₑ increased linearly with temperature, while fₛ decreased and Vₐₛ increased following power-law functions. These changes suggest that thermal effects influence both electrical and mechanical subsystems, potentially amplified by the viscoelastic “creep” effect inherent to loudspeaker suspensions. Finally, simulations of sealed and bandpass enclosures demonstrated noticeable shifts in acoustic performance under thermal variations, emphasizing the importance of considering temperature effects in enclosure design.
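For orientation, the linear Rₑ(T) trend reported above matches the textbook conductor-resistivity model sketched below. The copper temperature coefficient is a standard value and an assumption here; the paper fits its own coefficients from measurements.

```python
# Textbook linear model for voice-coil DC resistance vs. temperature.
ALPHA_CU = 0.00393   # 1/K, resistivity temperature coefficient of copper (assumed)

def re_at_temperature(re_ref, t_ref_c, t_c):
    """Voice-coil resistance (ohms) at t_c degC, given re_ref measured at t_ref_c."""
    return re_ref * (1.0 + ALPHA_CU * (t_c - t_ref_c))

# Example: a 6.0-ohm coil measured at 25 degC reaches about 7.2 ohms at 75 degC.
print(re_at_temperature(6.0, 25.0, 75.0))   # ~7.18
```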
Saturday May 24, 2025 9:00am - 10:45am CEST
Hall F ATM Studio Warsaw, Poland

9:00am CEST

Material Characterization and Variability in Loudspeaker Membranes for Acoustic Modeling
Saturday May 24, 2025 9:00am - 10:45am CEST
Finite Element Method (FEM) simulations are vital in the design of loudspeakers, offering a more efficient alternative to traditional trial-and-error approaches. Precise material characterization, however, is essential in ensuring that theoretical models align closely with measurements. Variations in material properties, particularly those of a loudspeaker’s membrane, can significantly influence loudspeaker performance. This work aims to establish a methodology for evaluating the variability of loudspeaker membrane materials, specifically cones and surrounds, to better understand each material's repeatability among samples and, overall, to improve the precision and reliability of loudspeaker simulations.


The study first conducts an in-depth analysis of membrane materials, focusing on their Young’s modulus and density, by utilizing both empirical and simulated data. Subsequently, complete loudspeakers were built and investigated using the membranes studied. A FEM simulation framework is presented, and observations are made on discrepancies between measured and simulated loudspeaker responses at specific frequencies and their relation to material modeling.

The results demonstrated significant alignment between simulations and real-life performance, offering interesting insights into the impact of small changes in material properties on the acoustic response of a loudspeaker. One significant finding was the frequency dependence of the Young’s modulus of the fiberglass used for a cone. Further validation can be achieved by expanding the dataset of measured materials, exploring more materials, and testing under varying conditions such as temperature and humidity. Such insights enable more accurate modeling of loudspeakers and lay the groundwork for exploring novel materials with enhanced acoustic properties, guiding the development of high-performance loudspeakers.
Speakers
Chiara Corsini

R&D engineer, FAITAL [ALPS ALPINE]
Chiara joined Faital S.p.A. in 2018, working as a FEM analyst in the R&D Department. Her research activities are focused on thermal phenomena associated with loudspeaker functioning, and the mechanical behavior of the speaker's moving parts. To this goal, she uses FEM and lumped parameter...
Luca Villa

FAITAL [ALPS ALPINE]
Romolo Toppi

FAITAL [ALPS ALPINE]
Saturday May 24, 2025 9:00am - 10:45am CEST
Hall F ATM Studio Warsaw, Poland

9:00am CEST

Shape Optimization of Waveguides for Improving the Directivity of Soft Dome Tweeters
Saturday May 24, 2025 9:00am - 10:45am CEST
This paper introduces a new algorithm for multiposition mixed-phase equalization of slot-loaded loudspeaker responses obtained in the horizontal and vertical planes, using finite impulse response (FIR) filters. The algorithm selects a ‘prototype response’ that yields the filter best optimizing a time-domain objective metric for equalization in a given direction. The objective metric includes a weighted linear combination of pre-ring energy, early and late reflection energy, and decay rate (characterizing impulse response shortening) during filter synthesis. The results show that the presented mixed-phase multiposition filtering algorithm achieves good equalization in all horizontal directions and for most positions in the vertical direction. Beyond its multiposition filtering capabilities, the algorithm and the metric are suitable for designing mixed-phase filters with low delays, an essential constraint for real-time processing.
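The following sketch illustrates a time-domain objective metric of the kind described above. The window boundaries, weights, and function names are illustrative assumptions, not the paper's actual formulation, and the paper's decay-rate term is omitted.

```python
# Sketch of a weighted time-domain equalization metric combining pre-ring
# energy and early/late reflection energy. Lower is better for a fixed
# direct-sound level; windowing and weights are assumed values.
import numpy as np

def eq_metric(h_eq, fs, w_pre=4.0, w_early=1.0, w_late=2.0):
    """Weighted energy metric for an equalized impulse response h_eq."""
    peak = int(np.argmax(np.abs(h_eq)))      # main arrival
    early_end = peak + int(0.005 * fs)       # assumed 5 ms 'early' window
    e_pre = np.sum(h_eq[:peak] ** 2)         # pre-ring energy before the peak
    e_early = np.sum(h_eq[peak + 1:early_end] ** 2)
    e_late = np.sum(h_eq[early_end:] ** 2)
    return w_pre * e_pre + w_early * e_early + w_late * e_late
```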
Saturday May 24, 2025 9:00am - 10:45am CEST
Hall F ATM Studio Warsaw, Poland

9:00am CEST

Supervised Machine Learning for Quality Assurance in Loudspeakers: Time Distortion Analysis
Saturday May 24, 2025 9:00am - 10:45am CEST
When a loudspeaker is driven with an instantaneous pulse of energy, its measured response will contain distortion. Factors such as speaker geometry, material properties, equipment error, and the conditions of the environment will create artifacts within the captured data. This paper explores the extraction of time-domain features from these responses, and the training of a predictive model to allow for classification and rapid quality assurance.
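A minimal sketch of such a pipeline is shown below; the specific feature choices and the classifier are illustrative assumptions, not the paper's method.

```python
# Sketch: time-domain features from measured impulse responses feeding a
# supervised classifier for pass/fail quality assurance. Feature choices
# and classifier are assumptions for illustration.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def ir_features(h, fs):
    """A few simple time-domain features of an impulse response h."""
    h = h / (np.max(np.abs(h)) + 1e-12)             # normalize the response
    env = np.abs(h)
    peak = int(np.argmax(env))
    energy = np.cumsum(h ** 2)
    decay_proxy = np.searchsorted(energy, 0.999 * energy[-1]) / fs  # decay time
    crest = env.max() / (np.sqrt(np.mean(h ** 2)) + 1e-12)          # crest factor
    ringing = np.sum(env[peak + int(0.002 * fs):])  # residual ringing after 2 ms
    return [decay_proxy, crest, ringing]

# X: list of measured IRs; y: pass/fail labels from production QA (not shown).
# clf = RandomForestClassifier().fit([ir_features(h, 48000) for h in X], y)
```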
Speakers
Saturday May 24, 2025 9:00am - 10:45am CEST
Hall F ATM Studio Warsaw, Poland

11:00am CEST

Loudness of movies for Broadcasting
Saturday May 24, 2025 11:00am - 12:00pm CEST
Broadcasting movies on linear TV or via streaming presents a considerable challenge, especially for highly dynamic content like action films. Normalising such content to the paradigm of "Programme Loudness" may result in dialogue levels much lower than the loudness reference level (-23 LUFS in Europe). On the other hand, normalising to the dialogue level may lead to overly loud sound effects. The EBU Loudness group PLOUD has addressed this issue with the publication of R 128 s4, the fourth supplement to the core recommendation R 128. In order to better understand the challenge, an extensive analysis of 44 dubbed movies (mainly Hollywood mainstream films) was conducted. These films had already been dynamically treated for broadcast delivery by experienced sound engineers. The background of the latest document of the PLOUD group will be presented and the main parameter LDR (Loudness-to-Dialogue Ratio) will be introduced. A systematic approach to when and how to proceed with dynamic treatment will be included.
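For orientation, a minimal sketch of the LDR idea as described above: the spread between overall programme loudness and dialogue loudness. The sign convention and the illustrative threshold below are assumptions, not the EBU's normative definitions.

```python
# Sketch of the Loudness-to-Dialogue Ratio (LDR) concept from R 128 s4.
def ldr(programme_loudness_lufs, dialogue_loudness_lufs):
    """LDR in LU: how far dialogue sits below the overall programme loudness."""
    return programme_loudness_lufs - dialogue_loudness_lufs

# An action movie normalized to -23 LUFS whose dialogue measures -29 LUFS
# has an LDR of 6 LU; a large spread like this may call for dynamic treatment.
print(ldr(-23.0, -29.0))   # 6.0
```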
Speakers
Florian Camerer

Senior Sound Engineer, ORF
Saturday May 24, 2025 11:00am - 12:00pm CEST
Hall F ATM Studio Warsaw, Poland

11:00am CEST

Students Project Expo
Saturday May 24, 2025 11:00am - 1:00pm CEST
Saturday May 24, 2025 11:00am - 1:00pm CEST
Hall F ATM Studio Warsaw, Poland

1:45pm CEST

Be A Leader!
Saturday May 24, 2025 1:45pm - 2:45pm CEST
Have you ever wondered how AES works? Let's meet up and talk about the benefits of volunteering and the path to leadership in AES! You could be our next Chair, Vice President, or even AES President!
Speakers
Leslie Gaston-Bird

President, Audio Engineering Society
Dr. Leslie Gaston-Bird (AMPS, MPSE) is President of the Audio Engineering Society and author of the books "Women in Audio", part of the AES Presents series and published by Focal Press (Routledge), and "Math for Audio Majors" (A-R Editions). She is a voting member of the Recording Academy...
Saturday May 24, 2025 1:45pm - 2:45pm CEST
Hall F ATM Studio Warsaw, Poland

3:00pm CEST

Convention Closing Ceremony
Saturday May 24, 2025 3:00pm - 4:30pm CEST
Saturday May 24, 2025 3:00pm - 4:30pm CEST
Hall F ATM Studio Warsaw, Poland
 



Filter sessions
  • Acoustic Transducers & Measurements
  • Acoustics
  • Acoustics of large performance or rehearsal spaces
  • Acoustics of smaller rooms
  • Room acoustic solutions and materials
  • Acoustics & Sig. Processing
  • AI
  • AI & Machine Audition
  • Analysis and synthesis of sound
  • Archiving and restoration
  • Audio and music information retrieval
  • Audio Applications
  • Audio coding and compression
  • Audio effects
  • Audio Effects & Signal Processing
  • Audio for mobile and handheld devices
  • Audio for virtual/augmented reality environments
  • Audio formats
  • Audio in Education
  • Audio perception
  • Audio quality
  • Auditory display and sonification
  • Automotive Audio
  • Automotive Audio & Perception
  • Digital broadcasting
  • Electronic dance music
  • Electronic instrument design & applications
  • Evaluation of spatial audio
  • Forensic audio
  • Game Audio
  • Generative AI for speech and audio
  • Hearing Loss Protection and Enhancement
  • High resolution audio
  • Hip-Hop/R&B
  • Impact of room acoustics on immersive audio
  • Instrumentation and measurement
  • Interaction of transducers and the room
  • Interactive sound
  • Listening tests and evaluation
  • Live event and stage audio
  • Loudspeakers and headphones
  • Machine Audition
  • Microphones converters and amplifiers
  • Mixing remixing and mastering
  • Multichannel and spatial audio
  • Music and speech signal processing
  • Musical instrument design
  • Networked Internet and remote audio
  • New audio interfaces
  • Perception & Listening Tests
  • Protocols and data formats
  • Psychoacoustics
  • Room acoustics and perception
  • Sound design and reinforcement
  • Sound design/acoustic simulation of immersive audio environments
  • Spatial Audio
  • Spatial audio applications
  • Speech intelligibility
  • Studio recording techniques
  • Transducers & Measurements
  • Wireless and wearable audio